Quick Links

Re: Optimize partial TOAST decompression

From:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To:	Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Binguo Bao <djydewang(at)gmail(dot)com>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Optimize partial TOAST decompression
Date:	2019-10-01 13:18:03
Message-ID:	20191001131803.j6uin7nho7t6vxzy@development
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Oct 01, 2019 at 02:34:20PM +0200, Tomas Vondra wrote:
>On Tue, Oct 01, 2019 at 12:08:05PM +0200, Tomas Vondra wrote:
>>On Tue, Oct 01, 2019 at 11:20:39AM +0500, Andrey Borodin wrote:
>>>
>>>
>>>>30 сент. 2019 г., в 22:29, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> написал(а):
>>>>
>>>>On Mon, Sep 30, 2019 at 09:20:22PM +0500, Andrey Borodin wrote:
>>>>>
>>>>>
>>>>>>30 сент. 2019 г., в 20:56, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> написал(а):
>>>>>>
>>>>>>I mean this:
>>>>>>
>>>>>>/*
>>>>>> * Use int64 to prevent overflow during calculation.
>>>>>> */
>>>>>>compressed_size = (int32) ((int64) rawsize * 9 + 8) / 8;
>>>>>>
>>>>>>I'm not very familiar with pglz internals, but I'm a bit puzzled by
>>>>>>this. My first instinct was to compare it to this:
>>>>>>
>>>>>>#define PGLZ_MAX_OUTPUT(_dlen) ((_dlen) + 4)
>>>>>>
>>>>>>but clearly that's a very different (much simpler) formula. So why
>>>>>>shouldn't pglz_maximum_compressed_size simply use this macro?
>>>>
>>>>>
>>>>>compressed_size accounts for possible increase of size during
>>>>>compression. pglz can consume up to 1 control byte for each 8 bytes of
>>>>>data in worst case.
>>>>
>>>>OK, but does that actually translate in to the formula? We essentially
>>>>need to count 8-byte chunks in raw data, and multiply that by 9. Which
>>>>gives us something like
>>>>
>>>>nchunks = ((rawsize + 7) / 8) * 9;
>>>>
>>>>which is not quite what the patch does.
>>>
>>>I'm afraid neither formula is correct, but all this is hair-splitting differences.
>>>
>>
>>Sure. I just want to be sure the formula is safe and we won't end up
>>using too low value in some corner case.
>>
>>>Your formula does not account for the fact that we may not need all bytes from last chunk.
>>>Consider desired decompressed size of 3 bytes. We may need 1 control byte and 3 literals, 4 bytes total
>>>But nchunks = 9.
>>>
>>
>>OK, so essentially this means my formula works with whole chunks, i.e.
>>if we happen to need just a part of a decompressed chunk, we still
>>request enough data to decompress it whole. This way we may request up
>>to 7 extra bytes, which seems fine.
>>
>>>Binguo's formula is appending 1 control bit per data byte and one extra
>>>control byte. Consider size = 8 bytes. We need 1 control byte, 8
>>>literals, 9 total. But compressed_size = 10.
>>>
>>>Mathematically correct formula is compressed_size = (int32) ((int64)
>>>rawsize * 9 + 7) / 8; Here we take one bit for each data byte, and 7
>>>control bits for overflow.
>>>
>>>But this equations make no big difference, each formula is safe. I'd
>>>pick one which is easier to understand and document (IMO, its nchunks =
>>>((rawsize + 7) / 8) * 9).
>>>
>>
>>I'd use the *mathematically correct* formula, it doesn't seem to be any
>>more complex, and the "one bit per byte, complete bytes" explanation
>>seems quite understandable.
>>
>
>Pushed.
>
>I've ended up using the *mathematically correct* formula, hopefully
>with sufficient explanation why it's correct. I've also polished a
>couple more comments, and pushed like that.
>
>Thanks to Binguo Bao for this improvement, and all the reviewers in this
>thread.
>

Hmmm, this seems to trigger a failure on thorntail, which is a sparc64
machine (and it seems to pass on all x86 machines, so far). Per the
backtrace, it seems to have failed like this:

Core was generated by `postgres: parallel worker for PID 2341 '.
Program terminated with signal SIGUSR1, User defined signal 1.
#0 heap_tuple_untoast_attr_slice (attr=<optimized out>, sliceoffset=<optimized out>, slicelength=<optimized out>) at /home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/access/common/detoast.c:235
235 max_size = pglz_maximum_compressed_size(sliceoffset + slicelength,
#0 heap_tuple_untoast_attr_slice (attr=<optimized out>, sliceoffset=<optimized out>, slicelength=<optimized out>) at /home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/access/common/detoast.c:235
#1 0x00000100003d4ae8 in ExecInterpExpr (state=0x10000d02298, econtext=0x10000d01510, isnull=0x7feffb2fd1f) at /home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/executor/execExprInterp.c:690
...

so likely on this line:

max_size = pglz_maximum_compressed_size(sliceoffset + slicelength,
TOAST_COMPRESS_SIZE(attr));

the offset+length is just intereger arithmetics, so I don't see why that
would fail. So it has to be TOAST_COMPRESS_SIZE, which is defined like
this:

#define TOAST_COMPRESS_SIZE(ptr) ((int32) VARSIZE(ptr) - TOAST_COMPRESS_HDRSZ)

I wonder if that's wrong, somehow ... Maybe it should use VARSIZE_ANY,
but then how would it work on any platform and only fail on sparc64?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Re: Optimize partial TOAST decompression at 2019-10-01 12:34:20 from Tomas Vondra

Responses

Re: Optimize partial TOAST decompression at 2019-10-01 14:10:37 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Kapila	2019-10-01 13:25:52	Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message	Andrew Dunstan	2019-10-01 12:40:26	Re: Proposal: Make use of C99 designated initialisers for nulls/values arrays