If I had to summarise my position on allowing NULL values, then I think of them similar to how I think of a loaded gun - not directly harmful itself, but a there is a reasonable chance of it going wrong and resulting in serious damages. If you need to use them, and you have a clear reason why, then use them. But otherwise avoid them as much as possible.
Allowing NULL values in a column can be bad for all kind of direct and indirect reasons.
- Oracle does not store NULL values in normal B-Tree indexes, which limits when the indexes can be used on columns that allow NULL values
- Such an index cannot be used to satisfy an "IS NULL" query constraint
- The index also cannot be used to satisfy a "column != value" query constraint
- Foreign keys that allow NULL are "optional", but again have index and optimizer side effects.
- Outer joins may become necessary, which again limit the choices available to the Optimizer.
- Many programming languages have no concept of NULL, being a value outside of the normal value range by definition.
- Either NULL values require special programming, or require some kind of shadow boolean variable to indicate if it is really NULL (no real value in the main variable).
- Many programmers struggle with the difference between NULL and empty or blank, which can all mean different things within a programming language e.g. a zero length string, or the number zero.
- NULL
cannot be using in normal equality and range tests. As it is outside
the normal value range it cannot be greater than or less than a real
value. And it cannot be equal to any other value in a column, including
another NULL. You must use the special constraint "
IS NULL
" to test for a NULL value, and never "= NULL
". - Calculations involving a NULL value evaluate to NULL i.e. unknown.
6 comments:
Nice post, agreed with what you're saying. NULLs are dangerous in a physical database design mainly because they are difficult to deal with correctly - bugs are easy to introduce and it takes a lot of care and experience to be aware of them and code defensively; and even then it's easy to get it wrong :)
Just one minor correction: an index on a column can be used by the CBO with a != predicate, just as it can with a = predicate. It's perhaps not often likely to, but it is possible.
Jeff,
I don't think an index can be used for a != test when the column being tested allows NULL values. This is because the NULL values are not stored in the index - only non-NULL values. So the only way to find all the other values in this case - including the NULL values - would be to do a full table scan.
I've not proved this (no test cases I'm afraid), but I'm sure I have seen this stated by other people. And it was one reason why I was not getting an expected execution plan from a complex query (other factors involved as well).
A composite B*tree index may be used to search for NULL values, as long as at least one column value in the index definition is not nullable. A constant number such as 0 or 1 may be used as the second column in an index definition to satisfy this restriction, for example:
CREATE INDEX IND_T1_C3 ON T1(C3,0);
With such an index, the optimizer may select to use the index when the following is found in the WHERE clause:
WHERE C3 IS NULL
An index may also be used to satisfy the != test, but the optimizer is very unlikely to select that access path without adding an INDEX hint to the SQL statement.
John,
Looks like your post is "inspired" by your question on forum recently.
Anyways, you said in response to Jeff that So the only way to find all the other values in this case - including the NULL values - would be to do a full table scan.
But you can not find all other values, including NULL values, using != operator. Because a NULL value is neither equal to nor not equal to any other value. See below
SQL> with src as (select level as l from dual connect by level <= 5 union all select null from dual)
2 select * from src where l != 3 ;
L
----------
1
2
4
5
4 rows selected.
Narenda,
I am writing a follow up post, and have just hit exactly the issue you mentioned - NULL is neither equal to nor unequal to any real value. I ran a "!=" query and got different row counts whether the column allowed NULL values or not. Thanks for pointing out yet another distinction about NULL values.
An index most certainly can be used for a != predicate, proof:
create table jefftest (n number);
create index jefftesti on jefftest (n);
insert into jefftest values (null);
insert into jefftest values (1);
insert into jefftest values (2);
commit;
explain plan for select * from jefftest where n != 2;
--------------------------------------
| Id | Operation | Name |
--------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | INDEX FULL SCAN | JEFFTESTI |
--------------------------------------
Post a Comment