SQL gurus, how can I make my query better?

Discussion in 'HTML, Graphics & Programming' started by HungryHippos, 4 Dec 2010.

  1. HungryHippos

    Sgarrista

    Joined: 25 Mar 2004

    Posts: 9,823

    Location: Fareham

    Hi guys,

    I'm trying to run a query to determine how much data a group of users has in our Zantaz EAS system. The problem is the query takes a bloody long time to process, so I think it could use some optimisation. The databases are very large.

    How can this be sped up?

    This is what I currently have:

    Code:
    SELECT USERS.USERNAME, USERS.USERID, USERS.EASSTATUS,
           SUM(PROFILELOCATION.COMPRESSEDSIZE)   AS COMPRESSED_SIZE,
           SUM(PROFILELOCATION.UNCOMPRESSEDSIZE) AS UNCOMPRESSED_SIZE
    FROM GROUPS
         INNER JOIN USERS ON GROUPS.GROUPID = USERS.GROUPID
         INNER JOIN REFER ON USERS.USERID = REFER.USERID
         INNER JOIN PROFILELOCATION ON REFER.MSGID = PROFILELOCATION.MSGID
    WHERE GROUPS.GROUPID = 191 AND REFER.FOLDERID > 0
    GROUP BY USERS.USERNAME, USERS.USERID, USERS.EASSTATUS
    ORDER BY USERS.USERNAME

    I've added my query designer view so you can better see how this works; essentially I'm joining the tables on common fields.

    http://img94.imageshack.us/img94/7842/easquery.jpg

    So this works across 4 tables:

    GROUPS
    USERS
    REFER
    PROFILELOCATION

    GROUPS contains all the groups with GROUPID/GROUPNAME.

    USERS contains the list of users; each user has a GROUPID which links to the GROUPS table to work out which group a user belongs to. This table also lists the USERNAME and EASSTATUS values: if EASSTATUS = 0 the user is enabled, 1 is disabled.

    REFER lists the messages that belong to each user.

    PROFILELOCATION lists all the messages with their compressed and uncompressed sizes.

    Cheers ;)
     
  2. medicX

    Associate

    Joined: 4 Jul 2010

    Posts: 59

    Location: Nottingham

    Hello,

    Have you created some good indexes? I suggest these:

    1. USERS table, create index GROUPID
    2. REFER table, create index USERID
    3. PROFILELOCATION table, create index MSGID

    Time the query as it is now and note down how long it takes, then create the above indexes and run the query once more. It should be sped up a bit.

    Some help on how to create indexes: http://www.1keydata.com/sql/sql-create-index.html
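
    Something along these lines should do it (the index names are just examples; SQL Server syntax):

    Code:
    -- Index names are examples; pick whatever fits your naming convention
    CREATE INDEX IX_USERS_GROUPID ON USERS (GROUPID);
    CREATE INDEX IX_REFER_USERID ON REFER (USERID);
    CREATE INDEX IX_PROFILELOCATION_MSGID ON PROFILELOCATION (MSGID);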

    Post back how it turns out.
     
  3. HungryHippos

    Sgarrista

    Joined: 25 Mar 2004

    Posts: 9,823

    Location: Fareham

    Thanks, I'll give it a crack. Is it possible to mess things up by creating too many indexes? I guess it would be. I'll make a note of the ones I add so I can remove them if necessary :)
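
    For my own notes, something like this should let me see what's already there and undo anything I add (the index name is just an example):

    Code:
    -- See which indexes already exist on a table
    EXEC sp_helpindex 'USERS';

    -- Remove one again if it turns out to be unnecessary (example name)
    DROP INDEX IX_USERS_GROUPID ON USERS;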
     
  4. fez

    Capodecina

    Joined: 22 Aug 2008

    Posts: 17,012

    Location: Tunbridge Wells

    Indexing your primary and foreign keys will make queries much, much quicker. The difference can be huge, so I wouldn't be surprised if that solves a lot of your issues.
     
  5. Pho

    Sgarrista

    Joined: 18 Oct 2002

    Posts: 9,224

    Location: Derbyshire

    Post a screenshot of the actual execution plan (not the estimated one) for the query; it's probably the easiest way to see which parts of your query are slow. It's one of the buttons in the toolbar of Management Studio.

    http://msdn.microsoft.com/en-us/library/ms178071.aspx
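
    If you'd rather do it from a query window, something like this spits out the actual plan plus timing and IO numbers (standard SQL Server settings):

    Code:
    -- Return the actual execution plan as XML alongside the results
    SET STATISTICS XML ON;
    -- Also handy: elapsed/CPU time and logical reads per table
    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    -- ...run the query here...

    SET STATISTICS XML OFF;
    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;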
     
  6. HungryHippos

    Sgarrista

    Joined: 25 Mar 2004

    Posts: 9,823

    Location: Fareham

    I added the indexes, but I think two of them were already in place; the only one that wasn't was the GROUPID index on the USERS table. I also took some screenshots of the execution plan, which I've put below.

    http://img255.imageshack.us/img255/8376/pic1f.png
    http://img152.imageshack.us/img152/676/pic2z.png

    It's quite long! Looks like all the workload is taken up by two of the sections though, which is interesting.

    The same query is taking varying lengths of time here: one run took 2.5 minutes, another 4.5, and the most recent took 10 minutes...!
     
  7. Goksly

    Capodecina

    Joined: 5 Mar 2003

    Posts: 10,641

    Location: Nottingham

    So are you happy that indexes are on all the join and primary key columns? If the query is run often then you can include the columns you are summing up in the index. That means there would be no key lookup, shaving 50% off your time (at the cost of more storage space).

    Also, varying run times can hint at out-of-date statistics. Also check that the indexes have been defragged. If up-to-the-second accuracy isn't important you can allow dirty reads - that should make the timings more consistent and avoid blocking other users.
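
    As a rough sketch (SQL Server syntax, the index name is made up):

    Code:
    -- Covering index: MSGID is the join key; the two size columns are INCLUDEd
    -- so the SUMs can be answered from the index alone, with no key lookup
    CREATE NONCLUSTERED INDEX IX_PROFILELOCATION_MSGID_SIZES
        ON PROFILELOCATION (MSGID)
        INCLUDE (COMPRESSEDSIZE, UNCOMPRESSEDSIZE);

    -- Allow dirty reads for the session if up-to-the-second accuracy isn't needed
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;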
     
  8. HungryHippos

    Sgarrista

    Joined: 25 Mar 2004

    Posts: 9,823

    Location: Fareham

    We have a second copy of the database for other reasons; I'm going to play around with that instead of adding more indexes to the live database.

    The query would only be run occasionally by us as administrators. It wouldn't be run under normal circumstances.

    The databases themselves won't change that frequently; we run overnight tasks that update them, but during the day, when we would normally run queries like this, the data stays pretty consistent.
     
  9. Ranked1_of_1

    Hitman

    Joined: 3 Sep 2010

    Posts: 557

    Is there any chance you could post a picture of the indexes on the REFER table? Also, why have you specified FOLDERID > 0? I'm not doubting your knowledge of the data, but I'm interested in what circumstances a composite key column would have values less than one.
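
    If a screenshot is a pain, the output of something like this would do just as well (SQL Server system views):

    Code:
    -- List each index on REFER together with the columns it covers
    SELECT i.name AS index_name, c.name AS column_name, ic.key_ordinal
    FROM sys.indexes AS i
         INNER JOIN sys.index_columns AS ic
             ON ic.object_id = i.object_id AND ic.index_id = i.index_id
         INNER JOIN sys.columns AS c
             ON c.object_id = ic.object_id AND c.column_id = ic.column_id
    WHERE i.object_id = OBJECT_ID('REFER')
    ORDER BY i.name, ic.key_ordinal;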
     
  10. Yakyb

    Gangster

    Joined: 8 Sep 2009

    Posts: 394

    I would firstly check the indexes on the tables, then check their fragmentation.

    Failing that, I would create a subquery, essentially:


    SELECT derivedtbl_1.UserName, derivedtbl_1.GroupID, derivedtbl_1.EASStatus,
           derivedtbl_2.COMPRESSED_SIZE, derivedtbl_2.UNCOMPRESSED_SIZE
    FROM (SELECT MSGID,
                 SUM(PROFILELOCATION.COMPRESSEDSIZE) AS COMPRESSED_SIZE,
                 SUM(PROFILELOCATION.UNCOMPRESSEDSIZE) AS UNCOMPRESSED_SIZE
          FROM PROFILELOCATION
          GROUP BY MSGID) AS derivedtbl_2
         INNER JOIN
         (SELECT Users.UserID, Users.UserName, Users.GroupID, Users.EASStatus, Refer.MSGID
          FROM Users
               INNER JOIN Refer ON Refer.USERID = Users.UserID
          WHERE (Users.GroupID = 191) AND (Refer.FolderID > 0)) AS derivedtbl_1
             ON derivedtbl_2.MSGID = derivedtbl_1.MSGID

    Or alternatively, take derivedtbl_1 and tie it directly to PROFILELOCATION.

    (Having just checked your original statement, you may have a large number of groups, so this may not help, but it could be worth a try.)

    Also, I just noticed you don't really need the GROUPS table to be joined; you can place the parameter on Users.GroupID (< check for index).
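
    As a sketch based on your original statement (the fragmentation check uses the SQL Server DMV):

    Code:
    -- Check index fragmentation on the tables involved
    SELECT OBJECT_NAME(ips.object_id) AS table_name,
           i.name AS index_name,
           ips.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
         INNER JOIN sys.indexes AS i
             ON i.object_id = ips.object_id AND i.index_id = ips.index_id
    WHERE OBJECT_NAME(ips.object_id) IN ('USERS', 'REFER', 'PROFILELOCATION');

    -- Your original query with the GROUPS join dropped; the filter sits on USERS.GROUPID instead
    SELECT USERS.USERNAME, USERS.USERID, USERS.EASSTATUS,
           SUM(PROFILELOCATION.COMPRESSEDSIZE)   AS COMPRESSED_SIZE,
           SUM(PROFILELOCATION.UNCOMPRESSEDSIZE) AS UNCOMPRESSED_SIZE
    FROM USERS
         INNER JOIN REFER ON USERS.USERID = REFER.USERID
         INNER JOIN PROFILELOCATION ON REFER.MSGID = PROFILELOCATION.MSGID
    WHERE USERS.GROUPID = 191 AND REFER.FOLDERID > 0
    GROUP BY USERS.USERNAME, USERS.USERID, USERS.EASSTATUS
    ORDER BY USERS.USERNAME;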
     
    Last edited: 7 Dec 2010