Monday, March 19, 2012

Joining on NULLS

I have a stored proc that matches 2 tables on a datetime column. I am
worried that matching on a NULL column is not a good idea. Should I consider
matching on a ISNULL(datecolumn1,0) = ISNULL(datecolumn2,0) a better idea?
Thanks.
DavidI believe outer joins should take care of this.
"David Chase" <dlchase@.lifetimeinc.com> wrote in message
news:%23GBhm7xCGHA.3812@.TK2MSFTNGP15.phx.gbl...
>I have a stored proc that matches 2 tables on a datetime column. I am
>worried that matching on a NULL column is not a good idea. Should I
>consider matching on a ISNULL(datecolumn1,0) = ISNULL(datecolumn2,0) a
>better idea? Thanks.
> David
>|||> worried that matching on a NULL column is not a good idea. Should I
> consider matching on a ISNULL(datecolumn1,0) = ISNULL(datecolumn2,0) a
> better idea?
Does this mean that you intended to match NULLs (i.e., you consider NULL =
NULL to be true)? If so, then your logic is dependent on the current
ANSI_NULLS setting. There is a section in BOL (sigh - as there usually is)
that discusses this particular issue - Accessing and Changing Relational
Data / Query Fundamentals / Filtering Rows with WHERE and HAVING / NULL
Comparison Search Conditions.
As for whether this (or any other approach) is "better" depends on many
factors. A connection setting dependency is generally not recommended.
Your "better" approach is dependent on knowledge of the domain of the
columns - is this dependency any "better" than the connection setting? I'll
ignore the use of the implicit conversion, something that can easily trip an
unsuspecting reader.
The short answer is that it IS better to use a more defensive approach to
coding. However, this particular case will also involve the identification
of a technique that yields the best performance for your given situation.
Performance is often driven as much by the batch characteristics (e.g., use
of parameters, plan usage) as it is by the query and schema. Logically, you
should use something like:
where (col1 = col2) or (col1 is null and col2 is null)
Isnull and coalesce can be used - as you indicated. There might also be
other ways of looking at the data that would lead you to a different
approach. If you are attempting to equate NULLs, perhaps this is an
indication of a flaw in the data model. If so, the "better" approach is to
find and fix this model flaw. A better model generally improves the system
as a whole, often by orders of magnitude.|||not sure what you mean:
"matching on a null column" to me means "null = null" - if that's your
meaning, then isnull(datecolumn1,0)=isnull(datecolumn2
,0) is the same thing.
yes - joining on nulls is a bad idea -- if the other join criteria (if
any) isn't selective enough, then you'll get a cartesian product for
these (x nulls in table1 * x nulls in table2).
why would you want to match them?
David Chase wrote:
> I have a stored proc that matches 2 tables on a datetime column. I am
> worried that matching on a NULL column is not a good idea. Should I consid
er
> matching on a ISNULL(datecolumn1,0) = ISNULL(datecolumn2,0) a better idea?
> Thanks.
> David
>|||clarification: by "null = null" i mean that these would "match", not
that you would use "where null = null".
"null = null" evaluates to null unless SET ANSI_NULLS is OFF.
therefore, these would not be included in the results, whereas
isnull(datecolumn1,0)=isnull(datecolumn2
,0) would be included in the
results regardless of ANSI_NULLS setting.
but the end result in the narrative is the same - consider nulls a match
and return them in the result set.
Trey Walpole wrote:
> not sure what you mean:
> "matching on a null column" to me means "null = null" - if that's your
> meaning, then isnull(datecolumn1,0)=isnull(datecolumn2
,0) is the same
> thing.
> yes - joining on nulls is a bad idea -- if the other join criteria (if
> any) isn't selective enough, then you'll get a cartesian product for
> these (x nulls in table1 * x nulls in table2).
> why would you want to match them?
> David Chase wrote:
>|||> "null = null" evaluates to null unless SET ANSI_NULLS is OFF.
Not for a JOIN operation, though. ANSI_NULLS does not change the meaning of
NULL = NULL for a join,
the unknown will still be false in the end:
USE tempdb
CREATE TABLE t1(c1 datetime, c2 int)
CREATE TABLE t2(c1 datetime, c2 int)
INSERT INTO t1 VALUES(NULL, 1)
INSERT INTO t2 VALUES(NULL, 3)
INSERT INTO t1 VALUES('20050101', 2)
INSERT INTO t2 VALUES('20050101', 4)
INSERT INTO t1 VALUES('20050102', 5)
INSERT INTO t2 VALUES('20050103', 6)
SELECT * FROM t1
SELECT * FROM t2
SET ANSI_NULLS OFF
SELECT *
FROM t1
INNER JOIN t2 ON t1.c1 = t2.c1
SELECT *
FROM t1 ,t2
WHERE t1.c1 = t2.c1
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Trey Walpole" <treypole@.newsgroups.nospam> wrote in message
news:ejF54IzCGHA.2040@.TK2MSFTNGP14.phx.gbl...
> clarification: by "null = null" i mean that these would "match", not that
you would use "where
> null = null".
> "null = null" evaluates to null unless SET ANSI_NULLS is OFF.
> therefore, these would not be included in the results, whereas
> isnull(datecolumn1,0)=isnull(datecolumn2
,0) would be included in the resu
lts regardless of
> ANSI_NULLS setting.
> but the end result in the narrative is the same - consider nulls a match a
nd return them in the
> result set.
>
> Trey Walpole wrote:|||ah yes - thanks for the clarification
Tibor Karaszi wrote:
>
> Not for a JOIN operation, though. ANSI_NULLS does not change the meaning
> of NULL = NULL for a join, the unknown will still be false in the end:
> USE tempdb
> CREATE TABLE t1(c1 datetime, c2 int)
> CREATE TABLE t2(c1 datetime, c2 int)
> INSERT INTO t1 VALUES(NULL, 1)
> INSERT INTO t2 VALUES(NULL, 3)
> INSERT INTO t1 VALUES('20050101', 2)
> INSERT INTO t2 VALUES('20050101', 4)
> INSERT INTO t1 VALUES('20050102', 5)
> INSERT INTO t2 VALUES('20050103', 6)
> SELECT * FROM t1
> SELECT * FROM t2
> SET ANSI_NULLS OFF
> SELECT *
> FROM t1
> INNER JOIN t2 ON t1.c1 = t2.c1
> SELECT *
> FROM t1 ,t2
> WHERE t1.c1 = t2.c1
>

No comments:

Post a Comment