RE: [SQLDownUnder] INNNER JOIN?


    [Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
    • From: Greg Low
    • Subject: RE: [SQLDownUnder] INNNER JOIN?
    • Date: Wed, 09 Mar 2005 15:00:16 +1100

    Hi Victor & Aaron,
    
    As Victor mentioned, it depends upon the indexes in place. DISTINCT doesn't
    have to result in another step. If you look at the execution plans for the
    following (against the pubs database):
    
    SELECT pub_id FROM publishers
    WHERE NOT EXISTS(SELECT * FROM titles WHERE titles.pub_id =
    publishers.pub_id)
    
    SELECT pub_id FROM publishers
    WHERE pub_id NOT IN(SELECT DISTINCT pub_id FROM titles)
    
    SELECT pub_id FROM publishers
    WHERE pub_id NOT IN(SELECT pub_id FROM titles)
    
    the first produces a more complex plan but one that would be more efficient
    on a large table. The second and third produce identical plans that are
    simpler but not faster and I believe as the tables grow in size, much less
    so. In this case, the DISTINCT is able to make use of the clustered index. 
    
    Testing I've done on my own apps shows a clear difference on larger
    databases. (It's a bit hard in pubs where there are so few entries and a
    bunch of indexes).
    
    And as Victor also pointed out, a LEFT OUTER JOIN rule is appropriate. If
    you look at the execution plan for the first, you'll see that being applied
    by the optimiser even though we haven't structured the statement that way.
    
    Regards,
    
    Greg
    
    -----Original Message-----
    From: SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx
    [mailto:SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of
    Victor.Isakov@xxxxxxxxxxxxxxxxxxxx
    Sent: Wednesday, 9 March 2005 2:04 PM
    To: SQLDownUnder@xxxxxxxxxxxxxxxxxxxxxx
    Subject: RE: [SQLDownUnder] INNNER JOIN?
    
    Don't forget that an explicit DISTINCT potentially involves another step by
    the optimizer, so generally I would only use it only where required.
    DISTINCT operations are a special type of GROUP BY. This then depends on the
    number of duplicates you expect to get (density), etc. But generally I avoid
    DISTINCTs unless there is a real need to use them.
    
    But there is no need, strictly speaking. 
    
    Generally the parser will translate the IN into a whole bunch of ORs and (I
    assume, memory is shocking) would get rid of duplicate search arguments
    (SARGs). But in this case it's a sub-query, so it won't know the list until
    run-time. So it will have to generate a hashtable, etc in TEMPDB. Now if
    your subquery generates 1 million records but only with a domain of 1..5 you
    might be tempted to get rid of the duplicates by using DISTINCT, but then
    again it will use a B-Tree to get to the initial values quicker so there's
    no need. You'd have an extra step being performed and you would save space,
    but it still might not be worth it.
    
    So it's a trade-off between space and speed.
    
    I think ultimately the answer depends on whether you have an index on the
    column in question. If so, you might consider using a DISTINCT if you have a
    high density as it should not impact performace too much doing the distinct
    step. Otherwise it might not be worth it.
    
    I think it's very much a suck-and-see exercise...
    
    It will be interesting to see other's comments on this.
    
    
    Cheers,
    
    Victor
    -----Original Message-----
    From: SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx
    [mailto:SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Junying Xue
    Sent: Wednesday,9 March 2005 1:52 PM
    To: SQLDownUnder@xxxxxxxxxxxxxxxxxxxxxx
    Subject: RE: [SQLDownUnder] INNNER JOIN?
    
    Thanks Greg!  I just learned one more thing about SQL Server.
    
    One more thing, do I need to use "distinct" in the sub-query if I use "not
    in"?
    
    -----Original Message-----
    From: SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx
    [mailto:SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Greg Low
    Sent: Wednesday, 9 March 2005 1:41 PM
    To: SQLDownUnder@xxxxxxxxxxxxxxxxxxxxxx
    Subject: RE: [SQLDownUnder] INNNER JOIN?
    
    Absolutely. Not In requires a list to be created in the sub-query. It would
    be nice if the optimiser recognised it and made it the same but I wouldn't
    want to depend on that behaviour.
    
    EXISTS can exit as soon as anything that proves the sub-query is found. 
    
    Another little note as well: In SQL Server, you don't need to specify a
    column in the select statement in an EXISTS clause ie: it's the one
    situation where SELECT * is considered "good form", unlike what occurs in
    other databases.
    
    Regards,
    
    Greg
    
    -----Original Message-----
    From: SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx
    [mailto:SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Junying Xue
    Sent: Wednesday, 9 March 2005 12:20 PM
    To: SQLDownUnder@xxxxxxxxxxxxxxxxxxxxxx
    Subject: RE: [SQLDownUnder] INNNER JOIN?
    
    I know for sure that there is a performance difference between "not in"
    and "not exists" in Oracle.  How about in SQL Server?
    
    -----Original Message-----
    From: SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx
    [mailto:SQLDownUnderList@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of William Luu
    Sent: Wednesday, 9 March 2005 11:25 AM
    To: SQLDownUnder@xxxxxxxxxxxxxxxxxxxxxx
    Subject: Re: [SQLDownUnder] INNNER JOIN?
    
    Thanks Aaron, that worked.
    
    On Wed, 9 Mar 2005 11:16:06 +1100, Aaron Bull <aaron@xxxxxxxxxxxx>
    wrote:
    <snip /> 
    > select ItemID
    > from Items
    > where
    > ItemID not in (select distinct FK_ItemID from Products)
    > 
    > Cheers,
    > 
    > Aaron
    >
    
    
    to unsubscribe to this list, please send a message back to the list with
    'unsubscribe' as the subject. Powered by mailenable.com - List managed by
    http://www.readify.net
    
    
    
    
    
    to unsubscribe to this list, please send a message back to the list with
    'unsubscribe' as the subject. Powered by mailenable.com - List managed by
    http://www.readify.net
    
    
    
    
    to unsubscribe to this list, please send a message back to the list with
    'unsubscribe' as the subject. Powered by mailenable.com - List managed by
    http://www.readify.net
    
    
    
    
    
    to unsubscribe to this list, please send a message back to the list with
    'unsubscribe' as the subject. Powered by mailenable.com - List managed by
    http://www.readify.net
    
    
    
    
    
    
    This message and any attachment is confidential and may be privileged or
    otherwise protected from disclosure.  If you have received it by mistake
    please let us know by reply and then delete it from your system; you should
    not copy the message or disclose its contents to anyone.
    
    ============================================================================
    ==
    
    
    
    to unsubscribe to this list, please send a message back to the list with
    'unsubscribe' as the subject. Powered by mailenable.com - List managed by
    http://www.readify.net
    
    
    
    
    to unsubscribe to this list, please send a message back to the list with 'unsubscribe' as the subject. Powered by mailenable.com - List managed by http://www.readify.net
    
    
    



    (Click here for more information on the sqldownunder mailling list)