Re: 8.0.3 regexp_replace()...

From: "rlee0001" <robeddielee(at)hotmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: 8.0.3 regexp_replace()...
Date: 2006-02-01 02:11:51
Message-ID: 1138759911.228641.129430@z14g2000cwz.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Martijn,

(Warning: This post contains somewhat of a long rant followed by a
question.)

I realize that NULL is the unknown value in SQL and that (most)
functions therefore treat it as such. I have no problem with "RETURNS
NULL ON NULL INPUT" except when a function returns NULL for no good
reason. If I were the ruler of the world I would declare that:

1) All functions correctly treat NULL input as "unknown" and if this
prevents them from returning a logical value they must return NULL.
(SQL has this already)
2) If an error occurs within a function which prevents the function
from returning a logical value (such as invalid inputs) the function
must raise an exception and not return. IE: No value is returned at all
-- not even NULL. PostgreSQL seems to already do this.
3) If a function can not calculate a return value due to some ambiguity
it should return NULL. This is really just an extention of rule 1.
4) If a function can calculate a return value (no processing error,
NULL input or other ambiguity exists) it MUST NOT return NULL.

My problem is with SUBSTRING. When it fails to find a match for the
regexp pattern within the source string it returns NULL. Why?! There is
no ambiguity nor NULL inputs! The result of the function is known to
logically by "No Results". So the only question is this: How do you
logically return that from a function?

a) An empty string is logically incorrect because an empty string is a
value and returning it would imply that it was matched.
b) NULL is logically incorrect because the result was known. NULL is
supposed to be returned when the result is unknown.

Does Postgres have a "EMPTY" or "NOTHING" return value? What does a
SELECT return when there are no matching records? NULL? Hell no! Then
what? Because whatever SELECT returns when it finds no matching records
is what SUBSTRING should return when there are no matching substrings.
Of course SELECT was never intended to return a SCALAR value either
while SUBSTRING was. Maybe "EMPTY" is something that SCALAR values
simply are no capable of specifying?

I am suggesting that the behaviour of SUBSTRING returning NULL when no
matches is found is either a bug in PostgreSQL or a flaw in the SQL
specification. It is not logical.

Another words I wanted by function to do this (in english):

Replace the next occurance of <pattern> in <sourcestring> with
<replacementstring>; if found. Repeat for all additional occurances; if
any.

At no point should any ambiguity ever be introduced into this function
so long as none of the three inputs are NULL since no external data
source is used. Therefore nothing should ever return NULL during the
operation of the function.

In my experience having to code "special cases" around the NULL value
is one of the biggest annoyances in SQL. Special cases should never
have to be coded for specific values that have no bearing on the
business logic of the application. As a programmer I want to be
reasonably assured that I can anticipate NULL popping up and that
anywhere that it might pop up it will be handled gracefully.

Obviously I'm somewhat new to SQL (less than 2 years) and especially to
PostgreSQL (about 2 months). In addition to returning null on null
input, what are the best programming practices for dealing with NULL
within a procedure/query? Was coalesce the best way to handle this?
Should I have just checked for NULL in my EXIT statement instead?

-Robert

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Michael Fuhr 2006-02-01 02:44:44 Re: libpq questions
Previous Message Bob Pawley 2006-02-01 02:11:14 Re: Data Conversion