From: | Jeremy Drake <pgsql(at)jdrake(dot)com> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>, Neil Conway <neilc(at)samurai(dot)com>, David Fetter <david(at)fetter(dot)org> |
Subject: | Re: patch adding new regexp functions |
Date: | 2007-02-17 08:23:17 |
Message-ID: | Pine.BSO.4.64.0702170005560.18849@resin.csoft.net |
Lists: | pgsql-hackers pgsql-patches |
On Sat, 17 Feb 2007, Peter Eisentraut wrote:
> Jeremy Drake wrote:
> > In case you haven't noticed, I am rather averse to making this return
> > text[] because it is much easier in my experience to use the results
> > when returned in SETOF rather than text[],
>
> The primary use case I know for string splitting is parsing
> comma/pipe/whatever separated fields into a row structure, and the way
> I see it your API proposal makes that exceptionally difficult.

For this case see string_to_array:
http://developer.postgresql.org/pgdocs/postgres/functions-array.html

select string_to_array('a|b|c', '|');
 string_to_array
-----------------
 {a,b,c}
(1 row)

> I don't know what your use case is, though. All of this is missing
> actual use cases.

The particular use case I had for this function was at a previous
employer, and I am not sure exactly how much detail is appropriate to
divulge. Basically, the project was doing some text processing inside of
postgres, and getting all of the words from a string into a table with
some processing (excluding stopwords and so forth) as efficiently as
possible was a big concern.
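
For illustration only (this is not the project's actual code), a minimal sketch of that kind of word extraction might look like the following. It assumes PostgreSQL 8.3 or later, where regexp_split_to_table() is available; that function is not the regexp_split signature proposed in this patch, and the stopwords table and the sample sentence are hypothetical.

```sql
-- Hypothetical stopword list; a real project would load its own.
CREATE TEMP TABLE stopwords (word text PRIMARY KEY);
INSERT INTO stopwords VALUES ('the'), ('a'), ('of');

-- Split on runs of whitespace, one word per row, then drop stopwords.
-- E'\\s+' is written as an escape string so the backslash survives
-- regardless of the standard_conforming_strings setting.
SELECT w.word
FROM regexp_split_to_table(lower('The quick brown fox of the forest'),
                           E'\\s+') AS w(word)
WHERE w.word <> ''
  AND NOT EXISTS (SELECT 1 FROM stopwords s WHERE s.word = w.word);
```

Because the words arrive as rows, the filtering is an ordinary WHERE clause and the result can feed an INSERT ... SELECT directly, which is the convenience argued for above.
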
The regexp_split function code was based on some code that a friend of
mine wrote which used PCRE rather than postgres' internal regexp support.
I don't know exactly what his use-case was, but he probably had
one because he wrote the function and had it returning SETOF text ;)
Perhaps he can share a general idea of what it was (nudge nudge)?
> > While, if you
> > really really wanted a text[], you could use the (fully documented)
> > ARRAY(select resultstr from regexp_split(...) order by startpos)
> > construct.
>
> I think, however, that we should be providing simple primitives that can
> be combined into complex expressions rather than complex primitives
> that have to be dissected apart to get simple results.

The simplest primitive is string_to_array(text, text), which returns text[],
but it was not sufficient for our needs.
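
As a hedged illustration of where the literal-delimiter primitive falls short, compare it with a regex-based split. The second query assumes PostgreSQL 8.3 or later for regexp_split_to_table(), which is not the function proposed in this patch; the input string is made up for the example.

```sql
-- string_to_array() splits on exactly one literal delimiter, so the
-- semicolon and the leading space stay inside the second element.
SELECT string_to_array('a, b;c', ',');

-- A regular expression can treat any run of commas, semicolons and
-- spaces as a single separator, giving one clean piece per row.
SELECT piece
FROM regexp_split_to_table('a, b;c', E'[,; ]+') AS t(piece);
```

The first query returns a two-element array whose second element still contains ' b;c', while the second query returns a, b and c as separate rows.
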
> > > As for the regexp_matches() function, it seems to me that it
> > > returns too much information at once. What is the use case for
> > > getting all of prematch, fullmatch, matches, and postmatch in one
> > > call?
> >
> > It was requested by David Fetter:
> > http://archives.postgresql.org/pgsql-hackers/2007-02/msg00056.php
> >
> > It was not horribly difficult to provide, and it seemed reasonable to
> > me. I have no need for them personally.
>
> David Fetter has also repeatedly failed to offer a use case for this, so I
> hesitate to accept this.

I have no strong opinion either way, so I will let those who do argue it
out and wait for the dust to settle ;)
--
The Law, in its majestic equality, forbids the rich, as well as the
poor, to sleep under the bridges, to beg in the streets, and to steal
bread.
-- Anatole France