replace_text optimization (StringInfo to varlena)

From:	"Daniel Verite" <daniel(at)manitou-mail(dot)org>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	replace_text optimization (StringInfo to varlena)
Date:	2019-02-13 15:38:50
Message-ID:	bc319ec6-60d0-4878-a800-bcc12a190c02@manitou-mail.org
Lists:	pgsql-hackers

Hi,

replace_text() in varlena.c builds the result in a StringInfo buffer,
and finishes by copying it into a freshly allocated varlena structure
with cstring_to_text_with_len(), in the same memory context.

It looks like that copy step could be avoided by prepending the
varlena header to the StringInfo to begin with, and returning the
buffer as a text*, as in the attached patch.

On large strings, the time saved can be significant. For instance
I'm seeing a ~20% decrease in total execution time on a test with
lengths in the 2-3 MB range, like this:

select sum(length(
replace(repeat('abcdefghijklmnopqrstuvwxyz', i*10), 'abc', 'ABC')
))
from generate_series(10000,12000) as i;

Also, at a glance, there are a few other functions with similar
StringInfo-to-varlena copies that seem avoidable:
concat_internal(), text_format(), replace_text_regexp().

Are there reasons not to do this? Otherwise, should it be considered
in a more principled way, such as adding to the StringInfo API
functions like void InitStringInfoForVarlena() and
text *StringInfoAsVarlena()?
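
To make the idea concrete, here is a minimal standalone sketch of what
such a pair of functions might do. It is not the attached patch and it
does not use PostgreSQL headers: MockStringInfo, MOCK_VARHDRSZ and all
mock_* functions are hypothetical stand-ins, and the header here simply
stores the total size as a plain 4-byte integer, whereas the real
SET_VARSIZE() macro uses its own encoding.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: stand-in for VARHDRSZ, a 4-byte length header. */
#define MOCK_VARHDRSZ ((int) sizeof(uint32_t))

/* Assumption: simplified stand-in for StringInfoData. */
typedef struct MockStringInfo
{
    char   *data;       /* buffer, always NUL-terminated */
    int     len;        /* bytes used, header included */
    int     maxlen;     /* bytes allocated */
} MockStringInfo;

/* Hypothetical init: allocate the buffer with the header space reserved. */
static void
mock_init_for_varlena(MockStringInfo *str)
{
    str->maxlen = 64;
    str->data = malloc(str->maxlen);
    if (str->data == NULL)
        abort();
    memset(str->data, 0, MOCK_VARHDRSZ);
    str->len = MOCK_VARHDRSZ;   /* payload starts right after the header */
    str->data[str->len] = '\0';
}

/* Append n bytes, doubling the buffer until they fit. */
static void
mock_append(MockStringInfo *str, const char *src, int n)
{
    while (str->len + n + 1 > str->maxlen)
    {
        str->maxlen *= 2;
        str->data = realloc(str->data, str->maxlen);
        if (str->data == NULL)
            abort();
    }
    memcpy(str->data + str->len, src, n);
    str->len += n;
    str->data[str->len] = '\0';
}

/*
 * Hypothetical finish step: write the total size into the reserved
 * header and return the same buffer, so no final copy is made.
 */
static char *
mock_as_varlena(MockStringInfo *str)
{
    uint32_t    total = (uint32_t) str->len;

    memcpy(str->data, &total, sizeof(total));
    return str->data;
}

/* Read back the total size (header + payload) stored by mock_as_varlena(). */
static uint32_t
mock_varsize(const char *v)
{
    uint32_t    total;

    memcpy(&total, v, sizeof(total));
    return total;
}

/* Pointer to the payload that follows the header. */
static const char *
mock_vardata(const char *v)
{
    return v + MOCK_VARHDRSZ;
}
```

A caller would run mock_init_for_varlena(), any number of mock_append()
calls, then mock_as_varlena(), and use the returned pointer directly as
the result. The one allocation made for the buffer is also the result's
allocation, which is where the saving over the copy-at-the-end approach
would come from.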

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment	Content-Type	Size
replace-text-no-copy.patch	text/plain	743 bytes

Responses

Re: replace_text optimization (StringInfo to varlena) at 2019-02-14 07:32:44 from Kyotaro HORIGUCHI
