From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | daniel(at)manitou-mail(dot)org |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: replace_text optimization (StringInfo to varlena) |
Date: | 2019-02-14 07:32:44 |
Message-ID: | 20190214.163244.187525748.horiguchi.kyotaro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
At Wed, 13 Feb 2019 16:38:50 +0100, "Daniel Verite" <daniel(at)manitou-mail(dot)org> wrote in <bc319ec6-60d0-4878-a800-bcc12a190c02(at)manitou-mail(dot)org>
> Hi,
>
> replace_text() in varlena.c builds the result in a StringInfo buffer,
> and finishes by copying it into a freshly allocated varlena structure
> with cstring_to_text_with_len(), in the same memory context.
>
> It looks like that copy step could be avoided by preprending the
> varlena header to the StringInfo to begin with, and return the buffer
> as a text*, as in the attached patch.
>
> On large strings, the time saved can be significant. For instance
> I'm seeing a ~20% decrease in total execution time on a test with
> lengths in the 2-3 MB range, like this:
>
> select sum(length(
> replace(repeat('abcdefghijklmnopqrstuvwxyz', i*10), 'abc', 'ABC')
> ))
> from generate_series(10000,12000) as i;
>
> Also, at a glance, there are a few other functions with similar
> StringInfo-to-varlena copies that seem avoidable:
> concat_internal(), text_format(), replace_text_regexp().
>
> Are there reasons not to do this? Otherwise, should it be considered
> in in a more principled way, such as adding to the StringInfo API
> functions like void InitStringInfoForVarlena() and
> text *StringInfoAsVarlena()?
First, I agree that the waste of cycles should be eliminated.
Grepping with 'cstring_to_text_with_len\(.*[\.>]data,.*\)' shows
many instances of the use. Though StringInfo seems very
open-minded, the number of instances would be a good reason to
have new API functions.
That is, I vote for providing a set of API for the use in
StringInfo. But it seems to be difficult to name the latter
function. The name convention for the object is basically
<verb>StringInfo. getVarlenaStringInfo/getTextStringInfo
apparently fits the convention but seems to me a bit strange.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Haribabu Kommi | 2019-02-14 07:34:07 | Re: pg_basebackup ignores the existing data directory permissions |
Previous Message | Noah Misch | 2019-02-14 07:26:23 | Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding |