From: | Thomas Butz <tbutz(at)optitool(dot)de> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #16241: Degraded hash join performance |
Date: | 2020-02-05 11:06:41 |
Message-ID: | 1965637501.286993.1580900801394.JavaMail.zimbra@optitool.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
I've opened an issue: https://github.com/giggls/mapnik-german-l10n/issues/40
I suspect that the number of executed regexp_replace calls is the culprit here.
The cache of regexp.c seems to be limited to 32 entries which might be to low to keep all involved regexes cached.
> Hi,
>
> On 2020-02-04 11:00:29 -0500, Tom Lane wrote:
>> Andres Freund <andres(at)anarazel(dot)de> writes:
>> > Interesting! The no-children one clearly shows that a lot of the the
>> > time is spent evaluating regular expressions (there's other regex
>> > functions in the profile too):
>> > 23.36% postgres postgres [.] subcolor
>>
>> Huh ...
>>
>> > I'm not aware of any relevant regular expression evaluation changes
>> > between 11 and 12. Tom, does this trigger anything?
>>
>> (1) Nope, I'm not either; the last non-back-patched change in that
>> code was c54159d44 in v10.
>>
>> (2) subcolor() is part of regex compilation, not execution, which makes
>> one wonder why it's showing up at all. Maybe the regex cache in
>> adt/regexp.c is overflowing and preventing useful caching? But
>> that didn't change in v12 either. Are these test cases really
>> 100% equivalent? I'm wondering if there are a few more "hot"
>> regex patterns in the v12 data ...
>
> They are not 100% equivalent, but the part of the plan we see is very
> similar rowcount wise. It's possible that the functions differ more
> however, there are different postgis versions involved, and apparently
> also an "osml10n" extension.
>
>
>> (3) Where the heck is the regex use coming from at all? I don't
>> see any regex operators in the plan. Maybe it's inside the
>> plpgsql function?
>
> It definitely is. The stack shows at least two levels of plpgsql
> functions. And Thomas has since confirmed that removing the functioncall
> fixes the issue.
>
> Based on the name I think this is somewhere around this:
> https://github.com/giggls/mapnik-german-l10n/blob/master/plpgsql/get_localized_name_from_tags.sql#L120
> The callgraph indicates that most of the cost comes from within
> textregexreplace_noopt.
>
> Not clear why the cache isn't fixing this - there are no variables in
> the regexp_replace calls as far as I can see.
>
> Greetings,
>
> Andres Freund
--
Thomas Butz
From | Date | Subject | |
---|---|---|---|
Next Message | Emil Iggland | 2020-02-05 11:46:33 | Re: BUG #15858: could not stat file - over 4GB |
Previous Message | Arseny Sher | 2020-02-05 09:04:06 | Re: ERROR: subtransaction logged without previous top-level txn record |