From: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Markhof, Ingolf" <ingolf(dot)markhof(at)de(dot)verizon(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: [E] Regexp_replace bug / does not terminate on long strings |
Date: | 2021-08-20 19:32:26 |
Message-ID: | D84B8669-286F-4F90-89D6-6566D93C2C08@enterprisedb.com |
Lists: | pgsql-general |
> On Aug 20, 2021, at 9:52 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> "a*" is easy. "(a*)\1" is less easy --- if you let the a* consume the
> whole string, you will not get a match, even though one is possible.
> In general, backrefs create a mess in what would otherwise be a pretty
> straightforward concept :-(.
The following queries take radically different amounts of time to run:
\timing
select regexp_replace(
    repeat('someone,one,one,one,one,one,one,', 60),
    '(?<=^|,)([^,]+)(?:,\1)+(?=$|,)',
    '\1', -- replacement
    'g' -- apply globally (all matches)
);
Time: 16476.529 ms (00:16.477)
select regexp_replace(
    repeat('someone,one,one,one,one,one,one,', 60),
    '(?<=^|,)([^,]+)(?:,\1){5}(?=$|,)',
    '\1', -- replacement
    'g' -- apply globally (all matches)
);
Time: 1.452 ms
The only difference between the patterns is the + versus the {5}. It looks to me like the first pattern should greedily match five ",one" repetitions and be forced to stop because ",someone" doesn't match. The second pattern should grab the five ",one" repetitions it was told to grab and never try to grab the ",someone". Other than that, they should be performing the same work, so I don't see why the performance should be so different.
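One way to sanity-check the claim that both patterns select the same text is to run them through a different regex engine. The following is only a sketch in Python, not PostgreSQL: Python's `re` module is a separate implementation, so it says nothing about the timing difference above. It also assumes a rewritten lookbehind, because `re` only accepts fixed-width lookbehinds; `(?<=^|,)` is expressed here as `(?:^|(?<=,))`. The names `UNIT`, `PLUS_PATTERN`, `FIVE_PATTERN`, and `dedupe` are made up for illustration, and the input is repeated 3 times rather than 60 to keep it short.

```python
import re

# One unit of the test string; the queries above repeat it 60 times.
UNIT = "someone,one,one,one,one,one,one,"

# Assumption: (?<=^|,) is rewritten as (?:^|(?<=,)) because Python's re
# module rejects lookbehinds whose alternatives differ in width.
PLUS_PATTERN = r"(?:^|(?<=,))([^,]+)(?:,\1)+(?=$|,)"
FIVE_PATTERN = r"(?:^|(?<=,))([^,]+)(?:,\1){5}(?=$|,)"


def dedupe(pattern, text):
    """Collapse each run of repeated comma-separated items to one item."""
    # re.sub replaces every non-overlapping match, like the 'g' flag above.
    return re.sub(pattern, r"\1", text)


text = UNIT * 3
plus_result = dedupe(PLUS_PATTERN, text)
five_result = dedupe(FIVE_PATTERN, text)

print(plus_result)                 # each run of "one" collapsed to a single "one"
print(plus_result == five_result)  # both patterns agree on this input
```

On this input both patterns match the same six-item run of "one" in each unit (the captured item plus five repetitions), which is consistent with the expectation that they describe the same matches here.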
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company