Re: Mixing greediness in regexp_matches

From:	"Daniel Verite" <daniel(at)manitou-mail(dot)org>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Mixing greediness in regexp_matches
Date:	2019-12-23 16:10:41
Message-ID:	90347d36-4b3c-4806-bd99-e5fae2cfad71@manitou-mail.org
Lists:	pgsql-general

Tom Lane wrote:

> regression=# select regexp_split_to_array('junkfoolbarfoolishfoobarmore',
> 'foo|bar|foobar');
> regexp_split_to_array
> -----------------------
> {junk,l,"",lish,more}
> (1 row)
>
> The idea would be to iterate over the array elements, tracking the
> corresponding position in the source string, and re-discovering at
> each break which of the original alternatives must've matched.
>
> It's sort of annoying that we don't have a simple "regexp_location"
> function that would give you back the starting position of the
> first match.

It occurred to me too that regexp_split_to_table or array would make
this problem really easy if only it had a mode to capture and return the
matched parts too.
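
For comparison, here is a minimal sketch of what such a "return the
matches too" mode looks like, using Python's re.split rather than
anything PostgreSQL offers. The function names are hypothetical. Note
that Python, like Perl, tries alternatives in the order written, so the
longest one is listed first here to reproduce the split shown above.

```python
import re

SOURCE = "junkfoolbarfoolishfoobarmore"


def split_plain(text):
    # No capturing group: only the pieces between matches come back.
    return re.split(r"foobar|foo|bar", text)


def split_with_matches(text):
    # A capturing group makes re.split return each matched separator
    # as well, interleaved with the pieces.
    return re.split(r"(foobar|foo|bar)", text)


if __name__ == "__main__":
    print(split_plain(SOURCE))
    print(split_with_matches(SOURCE))
```

With the matched parts in the list, there is no need to re-discover
which alternative matched at each break.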

FWIW, in plperl, there's a simple solution:

$string =~ s/(foobar|foo|...)/$replace{$1}/g

where %replace is a hash of the substitutions, e.g. (foo => 'baz', ...).
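Perl tries the strings in the alternation in their order of appearance
and takes the first one that matches at a given position, so you can
choose to be greedy or not just by sorting them by length: longest
first for the longest match, shortest first for the shortest.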
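
The same idea can be sketched in Python, whose re module also tries
alternatives left to right. This is only an illustration under stated
assumptions, not the plperl code itself: replace_all and its
longest_first parameter are hypothetical names, and the 'qux'
replacement is made up for the example.

```python
import re


def replace_all(text, replacements, longest_first=True):
    # Sort the keys by length so that the order of the alternation
    # decides whether the longest or the shortest key wins.
    keys = sorted(replacements, key=len, reverse=longest_first)
    # Escape each key so it is matched literally, not as a regex.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Look up the replacement for whichever key actually matched.
    return pattern.sub(lambda m: replacements[m.group(0)], text)


if __name__ == "__main__":
    subs = {"foo": "baz", "foobar": "qux"}  # 'qux' is a made-up value
    print(replace_all("foobar foo", subs, longest_first=True))
    print(replace_all("foobar foo", subs, longest_first=False))
```

With longest_first=True, "foobar" is replaced as a whole; with
longest_first=False, "foo" is tried first, so only the "foo" prefix of
"foobar" is replaced.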

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

In response to

Re: Mixing greediness in regexp_matches at 2019-12-23 15:26:15 from Tom Lane

Responses

Re: Mixing greediness in regexp_matches at 2019-12-23 16:16:58 from Tom Lane
