Re: Problem with a Pettern Matching Check

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: Sebastian Siewior <lavish(at)kamp-dsl(dot)de>, pgsql-sql(at)postgresql(dot)org
Subject: Re: Problem with a Pettern Matching Check
Date: 2005-08-16 01:14:58
Message-ID: 19851.1124154898@sss.pgh.pa.us
Lists: pgsql-sql

Michael Fuhr <mike(at)fuhr(dot)org> writes:
> On Mon, Aug 15, 2005 at 08:21:23PM -0400, Tom Lane wrote:
>> Given that we consider trailing spaces in char(n) to be semantically
>> insignificant, would it make sense to strip them before doing the
>> regex pattern match?

> How standards-compliant would that be? Does the standard specify
> what should happen when using SIMILAR TO with a char(n) value?

Hmm ... suddenly I'm getting a strong sense of deja vu ... I think we've
been around this merry-go-round before. SQL99 says:

    ii) The <predicate>

            MC LIKE PC

        is true if there exists a partitioning of MCV into
        substrings such that:

        1) A substring of MCV is a sequence of 0 (zero) or more
           contiguous <character representation>s of MCV and each
           <character representation> of MCV is part of exactly one
           substring.

        2) If the i-th substring specifier of PCV is an arbitrary
           character specifier, the i-th substring of MCV is any
           single <character representation>.

        3) If the i-th substring specifier of PCV is an arbitrary
           string specifier, then the i-th substring of MCV
           is any sequence of 0 (zero) or more <character
           representation>s.

        4) If the i-th substring specifier of PCV is neither an
           arbitrary character specifier nor an arbitrary string
           specifier, then the i-th substring of MCV is equal to
           that substring specifier according to the collating
           sequence of the <like predicate>, without the appending
           of <space> characters to MCV, and has the same length as
           that substring specifier.

        5) The number of substrings of MCV is equal to the number
           of substring specifiers of PCV.

Rule ii.4 says that you use the collating sequence associated with the
data values, which is where the SQL spec keeps its space sensitivity
information --- but the restrictions about not adding space characters
and having the same length seem to be intended to prevent use of
pad-space-insensitivity to create a match.
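
To make the effect of rule ii.4 concrete, here is a toy model of the
quoted rule in Python. It is only a sketch under stated assumptions, not
how PostgreSQL implements LIKE: there is no escape character, plain
character equality stands in for the collating sequence, and the names
like_match and pad_char_n are hypothetical, introduced just for this
illustration.

```python
def pad_char_n(value: str, n: int) -> str:
    """Blank-pad a string to width n, the way a char(n) column stores it.

    Hypothetical helper for this illustration only.
    """
    return value.ljust(n)


def like_match(value: str, pattern: str) -> bool:
    """Toy model of the quoted SQL99 rule ii for `value LIKE pattern`.

    Assumptions: no escape character, and plain character equality is
    used in place of a collating sequence. Each literal character is
    treated as its own substring specifier, which is equivalent to
    rule 4 under plain equality.
    """

    def match(vi: int, pi: int) -> bool:
        # Rule 5: every specifier must be used and every character of
        # the value must belong to exactly one substring (rule 1).
        if pi == len(pattern):
            return vi == len(value)
        spec = pattern[pi]
        if spec == "%":
            # Rule 3: arbitrary string specifier, 0 or more characters.
            return any(match(k, pi + 1) for k in range(vi, len(value) + 1))
        if vi == len(value):
            return False
        if spec == "_":
            # Rule 2: arbitrary character specifier, exactly one character.
            return match(vi + 1, pi + 1)
        # Rule 4: literal must be equal and of the same length, with no
        # padding applied to the value, so trailing spaces count.
        return value[vi] == spec and match(vi + 1, pi + 1)

    return match(0, 0)
```

Under this model, like_match(pad_char_n("abc", 5), "abc") is False,
because the two pad spaces are left over with no specifier to absorb
them, while like_match(pad_char_n("abc", 5), "abc%") is True because the
arbitrary string specifier takes them.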

I think we read this text before, came to the same conclusion, and
put in the special operator to make it behave that way. So ...
never mind.

regards, tom lane
