From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Binguo Bao <djydewang(at)gmail(dot)com>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Optimize partial TOAST decompression |
Date: | 2019-09-30 17:29:51 |
Message-ID: | 20190930172951.gdedvexnf4d2wv5e@development |
Lists: | pgsql-hackers |
On Mon, Sep 30, 2019 at 09:20:22PM +0500, Andrey Borodin wrote:
>
>
>> On 30 Sep 2019, at 20:56, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> I mean this:
>>
>> /*
>> * Use int64 to prevent overflow during calculation.
>> */
>> compressed_size = (int32) ((int64) rawsize * 9 + 8) / 8;
>>
>> I'm not very familiar with pglz internals, but I'm a bit puzzled by
>> this. My first instinct was to compare it to this:
>>
>> #define PGLZ_MAX_OUTPUT(_dlen) ((_dlen) + 4)
>>
>> but clearly that's a very different (much simpler) formula. So why
>> shouldn't pglz_maximum_compressed_size simply use this macro?
>
>compressed_size accounts for a possible increase in size during
>compression. pglz can consume up to 1 control byte for each 8 bytes of
>data in the worst case.
OK, but does that actually translate into the formula? We essentially
need to count the 8-byte chunks in the raw data and multiply that by 9,
which gives us something like
    nchunks = ((rawsize + 7) / 8) * 9;
which is not quite what the patch does.
>Even if the data as a whole compresses well, there can be a prefix that
>compresses extremely inefficiently. Thus, if you are going to decompress
>rawsize bytes, you need at most compressed_size bytes of compressed input.
OK, that explains why we can't use the PGLZ_MAX_OUTPUT macro.
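To make the difference between the expressions concrete, here is a minimal
sketch that puts them side by side. The function names bound_patch and
bound_chunks are hypothetical, made up for this illustration. The macro is
copied from the quote above. In bound_patch the division is parenthesized so
that it happens in 64 bits, which is this sketch's choice and not necessarily
what the patch does.

```c
#include <stdint.h>

/* Copied from the quoted text above. */
#define PGLZ_MAX_OUTPUT(_dlen) ((_dlen) + 4)

/*
 * Expression from the patch, with the division done in 64 bits
 * (an assumption of this sketch; hypothetical function name).
 */
static int32_t
bound_patch(int32_t rawsize)
{
    return (int32_t) (((int64_t) rawsize * 9 + 8) / 8);
}

/*
 * Count 8-byte chunks of raw data (rounding up) and charge 9 bytes
 * per chunk, as suggested in this message (hypothetical function name).
 */
static int32_t
bound_chunks(int32_t rawsize)
{
    return (int32_t) ((((int64_t) rawsize + 7) / 8) * 9);
}
```

For rawsize = 8 the patch's expression gives 10 and the chunk count gives 9;
for rawsize = 9 they give 11 and 18. So the two do not agree, and neither
matches PGLZ_MAX_OUTPUT, which is 12 for an input of 8 bytes.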
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services