From: | "Otto Hirr" <otto(dot)hirr(at)olabinc(dot)com> |
---|---|
To: | <dev(at)archonet(dot)com>, <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: A brief guide to nulls |
Date: | 2003-01-15 18:30:50 |
Message-ID: | 002d01c2bcc4$3abb23b0$fc2b5d42@frodo |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Another viewpoint, is to go a bit further into implementation...
i.e. for every field, there are two items...
*one item being the null/notnull info
*other item being the value info of the field per the type at def time
This faciliatates the concepts of
where <fieldname> is null ... testing the null/notnull info
where <fieldname> is not null ... testing the null/notnull info
where <fieldname> = "somevalue" ... testing the value info
etc.
Trying to "hide" the concept that every field has TWO items,
the value and whether it is null/notnull, seems to confuse people.
By SIMPLY stating that every SQL field has two items, then causes
a scripter to think a bit more deeply about a where clause...
Hence making your example ...
> SELECT * FROM customer WHERE sex='M';
> SELECT * FROM customer WHERE sex<>'M';
> Now you might think this returns all customers, but it will miss those
> where sex is null. You've asked for all rows where the value of sex is 'M'
> and all those with values not equal to 'M' but not rows with *no value at
all*
all that less mysterious...
i.e. the above two selects only tested one of the two items for the SQL
field,
the other item to test is the null/notnull item.
.. Otto
> -----Original Message-----
> From: pgsql-sql-owner(at)postgresql(dot)org
> [mailto:pgsql-sql-owner(at)postgresql(dot)org]On Behalf Of dev(at)archonet(dot)com
> Sent: Wednesday, January 15, 2003 9:23 AM
> To: pgsql-sql(at)postgresql(dot)org
> Subject: RFC: A brief guide to nulls
>
>
> There have been a few posts recently where people have had
> problems with
> nulls. Anyone got comments on the below before I submit it to
> techdocs?
>
> TIA
>
> - Richard Huxton
>
> A Brief Guide to NULLs
> ======================
>
> What is a null?
> ===============
> A null is *not* an empty string.
> A null is *not* a value.
> A null is *not* a "special" value.
> A null is the absence of a value.
>
>
> What do nulls mean?
> ===================
> Well, they *should* mean one of two things:
> 1. There is no applicable value
> 2. There is a value but it is unknown
>
> Example 1: Imagine you have a customer table with name and sex fields.
> If you get a new customer "ACME Widgets Ltd", the sex field
> is meaningless
> since your customer is a company (case 1).
> If you get a new customer "Jackie Smith" they might be male
> or female, but
> you might not know (case 2).
>
> Example 2: You have an address table with
> (street,city,county,postalcode)
> fields.
> You might insert an address ("10 Downing
> Street","London",Null,"WC1 1AA")
> since you don't have a valid county.
> You might also insert an address ("1 Any
> Street","Maidstone","Kent",Null)
> where there *must be* a valid postalcode, but you don't know
> what it is.
>
> It might be useful to be able to distinguish between these
> two cases - not
> applicable and unknown, but there is only one option "Null"
> available to
> us, so we can't.
>
>
> How do nulls work?
> ==================
> There is one very important rule when dealing with nulls. The
> result of
> any operation or comparison, when applied to a null is null. The only
> exception is testing if a value is null.
>
> Example: with the customer table above you could run the
> following queries:
> SELECT * FROM customer WHERE sex='M';
> SELECT * FROM customer WHERE sex<>'M';
> Now you might think this returns all customers, but it will miss those
> where sex is null. You've asked for all rows where the value
> of sex is 'M'
> and all those with values not equal to 'M' but not rows with
> *no value at
> all*
>
> It might help to think of a database as a set of statements
> you *know* to
> be true. A null indicates that you *cannot say anything at
> all* about that
> field. You can't say what it is, you can't say what it isn't,
> you can only
> say there is some information missing.
>
> So, to see all the customers with unknown or inapplicable sex
> you would need:
> SELECT * FROM customer WHERE sex IS NULL;
>
> There are actually three possible results for a test in SQL -
> True (the
> test passed), False (the test failed) and Null (you tested against a
> null). A result of null usually gets treated as False, so
> testing against
> nulls always fails.
>
> If you try to perform an operation on nulls, again the result
> is always
> null. So the results of all of the following are null:
> SELECT 'abc' || null;
> SELECT 1 + null;
> SELECT sqrt(null::numeric);
> The first case can be especially confusing. Concatenating a
> null string to
> a string value will return null, not the original value.
>
>
> Uniqueness and nulls
> ====================
> If you define a unique index on a column it prevents you inserting two
> values that are the same. It does not prevent you inserting
> as many nulls
> as you like. How could it, you don't have a value so it can't
> be the same
> as any other.
>
> Example: We create a table "ta" with a unique constraint on column "b"
> CREATE TABLE ta (
> a int4,
> b varchar(3),
> PRIMARY KEY (a)
> );
> CREATE UNIQUE INDEX ta_b_idx ON ta (b);
> INSERT INTO ta VALUES (1,'aaa'); -- succeeds
> INSERT INTO ta VALUES (2,'bbb'); -- succeeds
> INSERT INTO ta VALUES (3,null); -- succeeds
> INSERT INTO ta VALUES (4,'bbb'); -- fails
> INSERT INTO ta VALUES (5,null); -- succeeds!
>
> Given the definition of what a null is, you only have two
> choices: allow
> multiple nulls or allow no nulls. If you want no nulls,
> define the column
> as NOT NULL when creating the table.
>
>
> Keys and nulls
> ==============
> No column that is part of a primary key can be null. When you define a
> PRIMARY KEY, none of the columns mentioned can take a null value.
> Postgresql makes sure of this by defining the columns as NOT
> NULL for you.
>
> Example: With table "ta" we just created, \d ta will show column a as
> being not null. Otherwise, we could insert rows with a set to null and
> have no way to tell them apart.
>
>
> Subqueries and nulls
> ====================
> Since tests always fail when testing against nulls you can
> have unexpected
> results with sub-queries.
>
> Example: Assume we have a companies table and a diary table.
> Diary entries
> are usually related to a particular company but not always.
> SELECT co_name FROM companies WHERE co_id NOT IN (SELECT
> dy_company FROM
> diary);
> If any row in diary contains a null dy_company then you will get *no
> results*.
> We can expand the query like so:
> WHERE co_id NOT IN (SELECT dy_company FROM diary)
> WHERE co_id NOT IN (1, 2, null, 3...)
> WHERE NOT (co_id=1 OR co_id=2 OR co_id=null OR co_id=3...)
> WHERE NOT (... OR null OR ...)
> WHERE NOT (null)
> WHERE null
>
> You either need to explicitly check for null values, or
> define the column
> in question as NOT NULL.
>
>
> Further Information
> ===================
> Any good relational database book (try something written by Date)
> Bruce's book (link LHS at http://techdocs.postgresql.org)
> My Postgresql Notes (link at http://techdocs.postgresql.org)
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Vernon Wu | 2003-01-15 18:33:11 | Re: query speed joining tables |
Previous Message | Matthew Nuzum | 2003-01-15 18:29:47 | lost on self joins |