From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-bugs(at)postgreSQL(dot)org |
Subject: | string_to_array() is confused by ambiguous field separator |
Date: | 2006-10-06 21:12:10 |
Message-ID: | 6008.1160169130@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Good:
regression=# select string_to_array('123xx456xx789', 'xx');
string_to_array
-----------------
{123,456,789}
(1 row)
Not so good:
regression=# select string_to_array('123xx456xxx789', 'xx');
ERROR: negative substring length not allowed
The proximate problem is in the inner loop of text_position(): when it
finds a match but hasn't yet found matchnum of them, it advances by
only one character instead of advancing over the whole match. This
means it can report overlapping successive matches (here, the "xx" at
characters 9-10 and again at 10-11), which leads to an invalid
subscript calculation in text_to_array(). I think the correct approach
is to ignore overlapping matches, so that the result in the second
case would be
{123,456,x789}
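The difference between the two scanning rules can be sketched in a few lines of C. This is only an illustration, not the actual text_position() code: the function name nth_match_position() is hypothetical, and it compares plain bytes, whereas the real routine also has to handle multibyte encodings. It returns the 1-based position of the matchnum'th match, skipping the whole separator after each hit so that matches never overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the search loop discussed above.
 *
 * Returns the 1-based position of the matchnum'th non-overlapping
 * occurrence of sep in str, or 0 if there are fewer than matchnum
 * occurrences.  Byte-wise comparison only (an assumption made to keep
 * the sketch short).
 */
int
nth_match_position(const char *str, const char *sep, int matchnum)
{
    size_t      len = strlen(str);
    size_t      seplen = strlen(sep);
    size_t      pos = 0;
    int         found = 0;

    assert(seplen > 0 && matchnum > 0);

    while (pos + seplen <= len)
    {
        if (memcmp(str + pos, sep, seplen) == 0)
        {
            if (++found == matchnum)
                return (int) pos + 1;
            pos += seplen;      /* advance over the whole match */
        }
        else
            pos++;              /* no match here, try the next byte */
    }
    return 0;
}
```

With this rule, '123xx456xxx789' and 'xx' yield matches at positions 4 and 9 and no third match, so the last field is 'x789'. Replacing `pos += seplen` with `pos++` would report a third match at position 10, overlapping the second one, which is the situation that produces the negative substring length.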
There's another problem here, which is that the API of text_position()
is poorly chosen anyway: as defined, parsing a string of N fields
requires O(N^2) work. It'd be better to pass it a starting character
number for the search instead of a field number to find, and to break
out the setup step so that we don't have to repeat the conversion to
pg_wchar format for each field.
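A rough sketch of what such an API could look like follows. All names here (SplitState, split_setup(), split_next(), count_fields()) are hypothetical and chosen for illustration; the sketch works on plain bytes, so the one-time setup only caches lengths, where the real code would presumably do the pg_wchar conversion once at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical search state, prepared once per string to be split. */
typedef struct
{
    const char *str;
    size_t      len;
    const char *sep;
    size_t      seplen;
} SplitState;

/* One-time setup step, so per-field searches do not repeat this work. */
void
split_setup(SplitState *st, const char *str, const char *sep)
{
    st->str = str;
    st->len = strlen(str);
    st->sep = sep;
    st->seplen = strlen(sep);
    assert(st->seplen > 0);
}

/*
 * Return the 0-based offset of the first separator match at or after
 * start, or -1 if there is none.
 */
long
split_next(const SplitState *st, size_t start)
{
    size_t      pos;

    for (pos = start; pos + st->seplen <= st->len; pos++)
    {
        if (memcmp(st->str + pos, st->sep, st->seplen) == 0)
            return (long) pos;
    }
    return -1;
}

/*
 * Count the fields in str, resuming each search just past the previous
 * match instead of rescanning from the beginning of the string.
 */
int
count_fields(const char *str, const char *sep)
{
    SplitState  st;
    size_t      start = 0;
    long        hit;
    int         nfields = 1;

    split_setup(&st, str, sep);
    while ((hit = split_next(&st, start)) >= 0)
    {
        nfields++;
        start = (size_t) hit + st.seplen;   /* skip the whole match */
    }
    return nfields;
}
```

Because the caller restarts each search at the end of the previous match, every position in the string is examined once no matter how many fields there are, and overlapping matches cannot be reported. Both example strings above come out as three fields under this scheme.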
Any objections?
regards, tom lane