Friday, March 9, 2012

Join vs Where performance

Consider the following hypothetical queries:
QUERY A
Select ...
>From TableA TA Join TableB TB
ON TA.Field1 = TB.Field1 AND
TA.Field2 = TB.Field2 AND
TA.Field3 = TB.Field3
Where
TA.Field4 = 'Some Value'
QUERY B
Select ...
>From TableA TA Join TableB TB
ON TA.Field1 = TB.Field1
Where
TA.Field4 = 'Some Value' AND
TA.Field2 = TB.Field2 AND
TA.Field3 = TB.Field3
Notice that the difference is that the Join clause in Query A has more
than one expression to evaluate but in Query B, the Join only has a
single expression and the other expressions have been moved to the
Where clause.
These queries should return the same results but my question is about
performance. What are the best practices regarding Joins vs the Where
clause? Do these queries have significantly differing performance?
Which is preferred and why?
Can someone point me to some articles or books that would help me to
under stand?
ThanksChris,
I don't think you will see any performance differences (generally) between
either method (INNER JOINS). I generally prefer to use JOINs for the join
criteria and WHERE for filtering. Using JOIN can help reduce the number of
cross joins when the WHERE clause is ommitted. Also, ANSI compliance when
using LEFT or RIGHT as opposed to *= or =*.
HTH
Jerry
"Chris Dunaway" <dunawayc@.gmail.com> wrote in message
news:1129744620.635243.276050@.o13g2000cwo.googlegroups.com...
> Consider the following hypothetical queries:
> QUERY A
> Select ...
> ON TA.Field1 = TB.Field1 AND
> TA.Field2 = TB.Field2 AND
> TA.Field3 = TB.Field3
> Where
> TA.Field4 = 'Some Value'
>
> QUERY B
> Select ...
> ON TA.Field1 = TB.Field1
> Where
> TA.Field4 = 'Some Value' AND
> TA.Field2 = TB.Field2 AND
> TA.Field3 = TB.Field3
>
> Notice that the difference is that the Join clause in Query A has more
> than one expression to evaluate but in Query B, the Join only has a
> single expression and the other expressions have been moved to the
> Where clause.
> These queries should return the same results but my question is about
> performance. What are the best practices regarding Joins vs the Where
> clause? Do these queries have significantly differing performance?
> Which is preferred and why?
> Can someone point me to some articles or books that would help me to
> under stand?
> Thanks
>|||hit CTRL + K
run both queries and look if the execution plans are different
http://sqlservercode.blogspot.com/
"Chris Dunaway" wrote:

> Consider the following hypothetical queries:
> QUERY A
> Select ...
> ON TA.Field1 = TB.Field1 AND
> TA.Field2 = TB.Field2 AND
> TA.Field3 = TB.Field3
> Where
> TA.Field4 = 'Some Value'
>
> QUERY B
> Select ...
> ON TA.Field1 = TB.Field1
> Where
> TA.Field4 = 'Some Value' AND
> TA.Field2 = TB.Field2 AND
> TA.Field3 = TB.Field3
>
> Notice that the difference is that the Join clause in Query A has more
> than one expression to evaluate but in Query B, the Join only has a
> single expression and the other expressions have been moved to the
> Where clause.
> These queries should return the same results but my question is about
> performance. What are the best practices regarding Joins vs the Where
> clause? Do these queries have significantly differing performance?
> Which is preferred and why?
> Can someone point me to some articles or books that would help me to
> under stand?
> Thanks
>|||I have done that and, being inexperienced in T-SQL, I do not fully
understand what the execution plans are telling me. My question was
more general than specific. The hypothetical queries I provided may
perform equivalently. I am more interested in best practices. In
general, when joining two table using JOIN, is it advantageous to have
more than one join expression? Also, would you ever put a constant
value in a join expression like this:
Select ...
>From TableA TA Join TableB TB
ON TA.Field1 = TB.Field1 AND
TA.Field2 = TB.Field2 AND
TA.Field3 = TB.Field3 AND
TA.Field5 = 'const expression'
Where
TA.Field4 = 'Some Value'
I want to make my queries as fast as possible (don't we all?) and I
want to make sure that I use practices that others have found to
reliably produce better results.
Can you recommend any books that will help me to learn more about how
Queries are optimized in SQL Server?
Thanks again.|||Inside Microsoft SQL Server 2000
by Kalen Delaney
http://www.amazon.com/exec/obidos/A...link%5Fcode=xm2
One of the best books
"Chris Dunaway" wrote:

> I have done that and, being inexperienced in T-SQL, I do not fully
> understand what the execution plans are telling me. My question was
> more general than specific. The hypothetical queries I provided may
> perform equivalently. I am more interested in best practices. In
> general, when joining two table using JOIN, is it advantageous to have
> more than one join expression? Also, would you ever put a constant
> value in a join expression like this:
> Select ...
> ON TA.Field1 = TB.Field1 AND
> TA.Field2 = TB.Field2 AND
> TA.Field3 = TB.Field3 AND
> TA.Field5 = 'const expression'
> Where
> TA.Field4 = 'Some Value'
> I want to make my queries as fast as possible (don't we all?) and I
> want to make sure that I use practices that others have found to
> reliably produce better results.
> Can you recommend any books that will help me to learn more about how
> Queries are optimized in SQL Server?
> Thanks again.
>|||IMO, your best performance will be realized if have built the appropriate
indexes on your tables.
I generally stay away from compound keys unless they are used specifically
for sorting.
When MS bought Fox Software they incorporated some of the technology known
then as Rushmore into SQL Server. Rushmore relied almost exclusively on
index schemes for speed.
If you have indexes build on the fields that will be most commonly used in
queries and join conditions, you will see better performance than if the
indexes did not exist. However, if the tables are small, you may not see
any improvement at all.
HTH,
-Steve-|||Hi Chris,
There is no significant performance difference between the two. You can
compare the query plans to check if they are the same (no knowledge of
query plans is required for that). If the are the same, then execution
(and performance) will be the same.
You ask for best practices. To me, the best way to write a query is to
make the query easy to read, easy to maintain, but still resulting in
efficient execution. In that order (performance last).
IMO, the best practice is to only specify the joining columns in the ON
clause, and all filtering expressions in the WHERE clause. Typically,
this means that you only mention the foreign key column(s) in the ON
clause.
Of course, this only applies to inner joins. For outer joins the
functional difference will determine which expressions are placed in the
ON clause and which in the WHERE clause.
Gert-Jan
Chris Dunaway wrote:
> Consider the following hypothetical queries:
> QUERY A
> Select ...
> ON TA.Field1 = TB.Field1 AND
> TA.Field2 = TB.Field2 AND
> TA.Field3 = TB.Field3
> Where
> TA.Field4 = 'Some Value'
> QUERY B
> Select ...
> ON TA.Field1 = TB.Field1
> Where
> TA.Field4 = 'Some Value' AND
> TA.Field2 = TB.Field2 AND
> TA.Field3 = TB.Field3
> Notice that the difference is that the Join clause in Query A has more
> than one expression to evaluate but in Query B, the Join only has a
> single expression and the other expressions have been moved to the
> Where clause.
> These queries should return the same results but my question is about
> performance. What are the best practices regarding Joins vs the Where
> clause? Do these queries have significantly differing performance?
> Which is preferred and why?
> Can someone point me to some articles or books that would help me to
> under stand?
> Thanks

No comments:

Post a Comment