From: | Petr Jelinek <petr(at)2ndquadrant(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Vladimir Leskov <vladimirlesk(at)yandex-team(dot)ru> |
Subject: | Re: pglz performance |
Date: | 2019-09-25 21:24:18 |
Message-ID: | 704b2f0a-fef3-cee0-bd55-b5704074f4c3@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 05/08/2019 09:26, Andres Freund wrote:
> Hi,
>
> On 2019-08-05 00:19:28 +0200, Petr Jelinek wrote:
>> It carries that information inside the compressed value, like I said in the
>> other reply, that's why the extra byte.
>
> I'm not convinced that that is a good plan - imo the reference to the
> compressed data should carry that information.
>
> I.e. in the case of toast, at least toast pointers should hold enough
> information to determine the compression algorithm. And in the case of
> WAL, the WAL record should contain that.
>
Point taken.
>
> For external datums I suggest encoding the compression method as a
> distinct VARTAG_ONDISK_COMPRESSED, and then have that include the
> compression method as a field.
So the new reads/writes will use this and reads of old format won't
change? Sounds fine.
>
> For in-line compressed values (so VARATT_IS_4B_C), doing something
> roughly like you did, indicating the type of metadata following using
> the high bit sounds reasonable. But I think I'd make it so that if the
> highbit is set, the struct is instead entirely different, keeping a full
> 4 byte byte length, and including the compression type header inside the
> struct. Perhaps I'd combine the compression type with the high-bit-set
> part? So when the high bit is set, it'd be something like
>
> {
> int32 vl_len_; /* varlena header (do not touch directly!) */
>
> /*
> * Actually only 7 bit, the high bit determines whether this
> * is the old compression header (if unset), or this type of header
> * (if set).
> */
> uint8 type;
>
> /*
> * Stored as uint8[4], to avoid unnecessary alignment padding.
> */
> uint8[4] length;
>
> char va_data[FLEXIBLE_ARRAY_MEMBER];
> }
>
Won't this break BW compatibility on big-endian (if I understand
corretly what you are trying to achieve here)?
> I think it's worth spending some effort trying to get this right - we'll
> be bound by design choices for a while.
>
Sure, however I am not in business of redesigning TOAST from scratch
right now, even if I do agree that the current header is far from ideal.
--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Yuli Khodorkovskiy | 2019-09-25 21:40:35 | Re: add a MAC check for TRUNCATE |
Previous Message | Petr Jelinek | 2019-09-25 21:14:18 | Re: DROP SUBSCRIPTION with no slot |