From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Joel Jacobson <joel(at)compiler(dot)org> |
Subject: | Re: Another regexp performance improvement: skip useless paren-captures |
Date: | 2021-08-08 17:04:38 |
Message-ID: | 3173384.1628442278@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> writes:
> The patch triggers an assertion that master does not:
> +select 'azrlfkjbjgidgryryiglcabkgqluflu' !~ '(.(.)((.)))((?:(\3)))';
On looking into this, it's pretty simple: regexec.c has an assertion
that a pure-capture subre node ought to be doing some capturing.
        case '(':               /* no-op capture node */
            assert(t->child != NULL);
            assert(t->capno > 0);
That's fine as of HEAD, but with the proposed patch, we may notice
that the node isn't actually referenced by any backref, and remove
its capture marker, allowing this assertion to fire. Nothing's
really wrong though.
There seem to be three things we could do about that:
1. Extend removecaptures() so that it can actually remove no-op
capture nodes if it's removed their capture markings. This would
substantially complicate that function, and I judge that it's not
worth the trouble. We'll only have such nodes in cases of
capturing parentheses immediately surrounding capturing parentheses,
which doesn't seem like a case worth expending sweat for.
2. Just drop the "t->capno > 0" assertion in regexec.c.
3. Weaken said assertion, perhaps by also checking the BRUSE flag bit.
Not sure that I see any point to #3, so I just dropped the
assertion in the attached.
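
To make the failure mode concrete, here is a minimal standalone sketch of why the second assertion can fire and why dropping it looks harmless. Everything in it is an assumption made for illustration: struct toy_subre and the toy_* functions are hypothetical stand-ins, not the real subre struct, removecaptures(), or the regexec.c dissection code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a subre node; not the real struct. */
struct toy_subre
{
    char        op;             /* '(' = no-op capture node, '=' = plain leaf */
    int         capno;          /* capture number, or 0 if nothing is captured */
    struct toy_subre *child;    /* sole child of a '(' node, else NULL */
};

/*
 * Mimic the effect described above: if no backref refers to this capture
 * node, clear its capture marker.  Returns 1 if the marker was cleared.
 */
int
toy_remove_capture(struct toy_subre *t, int referenced_by_backref)
{
    if (t->op == '(' && t->capno > 0 && !referenced_by_backref)
    {
        t->capno = 0;
        return 1;
    }
    return 0;
}

/* The two conditions the original assertions test, as a plain predicate. */
int
toy_old_assertions_hold(const struct toy_subre *t)
{
    return t->child != NULL && t->capno > 0;
}

/*
 * Toy dissection with the capno assertion dropped (option 2): a '(' node
 * just hands off to its child, so capno is never consulted here.
 * Returns the number of '(' nodes passed through before reaching a leaf.
 */
int
toy_dissect(const struct toy_subre *t)
{
    switch (t->op)
    {
        case '(':
            assert(t->child != NULL);   /* a '(' node still needs a child */
            return 1 + toy_dissect(t->child);
        default:
            return 0;
    }
}
```

In this toy model, building a '(' node around another '(' node and clearing the outer node's marker makes toy_old_assertions_hold() return 0, while toy_dissect() still walks the tree normally. That is the sense in which nothing is really wrong: the node is left in place and simply no longer captures.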
I've also rebased over the bug fixes from the other thread,
and added a couple more test cases.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
optimize-useless-captures-2.patch | text/x-diff | 14.4 KB |