Compressed TOAST Slicing

From:	Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Compressed TOAST Slicing
Date:	2018-11-01 20:55:16
Message-ID:	CACowWR07EDm7Y4m2kbhN_jnys=BBf9A6768RyQdKm_=NpkcaWg@mail.gmail.com
Lists:	pgsql-hackers

Currently, PG_DETOAST_DATUM_SLICE when run on a compressed TOAST entry will
first decompress the whole object, then extract the relevant slice.

When the desired slice is at or near the front of the object, this is
obviously non-optimal.

The attached patch adds in a code path to do a partial decompression of the
TOAST entry, when the requested slice is at the start of the object.
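To illustrate the idea only (this is not the patch, and not the pglz format), here is a minimal sketch of prefix-only decompression. It assumes a toy run-length encoding, and the function name toy_decompress_prefix is hypothetical. The point is that the loop exits as soon as the requested prefix is filled, so the rest of the compressed input is never read.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy run-length format (NOT pglz): the input is a sequence of
 * (count, byte) pairs, each expanding to `count` copies of `byte`.
 *
 * Decompress at most `want` bytes into dst, stopping as soon as the
 * requested prefix is complete. If `consumed` is not NULL, it receives
 * the number of input bytes that were actually read.
 */
static size_t
toy_decompress_prefix(const unsigned char *src, size_t srclen,
                      unsigned char *dst, size_t want,
                      size_t *consumed)
{
    size_t in = 0;
    size_t out = 0;

    assert(srclen % 2 == 0);    /* input must consist of whole pairs */

    while (in < srclen && out < want)
    {
        size_t count = src[in];
        unsigned char value = src[in + 1];

        in += 2;

        /* The final run may be cut short at the slice boundary. */
        for (size_t i = 0; i < count && out < want; i++)
            dst[out++] = value;
    }

    if (consumed != NULL)
        *consumed = in;
    return out;
}
```

Asking for 5 bytes from an input that would expand to several hundred only touches the first couple of pairs; a caller that wants everything passes a `want` at least as large as the full uncompressed size.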

As an example of the improvement possible, take this trivial case:

create table slicingtest (
  id serial primary key,
  a text
);

insert into slicingtest (a)
  select repeat('xyz123', 10000) as a
  from generate_series(1,10000);
\timing
select sum(length(substr(a, 0, 20))) from slicingtest;

On master, in the current state on my wee laptop, I get

Time: 1426.737 ms (00:01.427)

With the patch, on my wee laptop, I get

Time: 46.886 ms

As usual, doing less work is faster.

Interesting note to motivate a follow-on patch: the substr() function does
attempt to slice, but the left() function does not. So, if this patch is
accepted, the next patch will add slicing behaviour to left().

If nobody lights me on fire, I'll submit to commitfest shortly.

Attachment	Content-Type	Size
compressed-datum-slicing-20190101a.patch	application/octet-stream	5.6 KB

Responses

Re: Compressed TOAST Slicing at 2018-11-01 21:29:15 from Stephen Frost
