From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Compression of full-page-writes |
Date: | 2014-12-08 19:37:44 |
Message-ID: | CA+TgmoYhw0pkAD=nPPdpoeT0itF5S3sHO-wEWEx7k9bYZS8VqA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 8, 2014 at 2:21 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-12-08 14:09:19 -0500, Robert Haas wrote:
>> > records, just fpis. There is no evidence that we even want to compress
>> > other record types, nor that our compression mechanism is effective at
>> > doing so. Simple => keep name as compress_full_page_writes
>>
>> Quite right.
>
> I don't really agree with this. There's lots of records which can be
> quite big where compression could help a fair bit. Most prominently
> HEAP2_MULTI_INSERT + INIT_PAGE. During initial COPY that's the biggest
> chunk of WAL. And these are big and repetitive enough that compression
> is very likely to be beneficial.
>
> I still think that just compressing the whole record if it's above a
> certain size is going to be better than compressing individual
> parts. Michael argued thta that'd be complicated because of the varying
> size of the required 'scratch space'. I don't buy that argument
> though. It's easy enough to simply compress all the data in some fixed
> chunk size. I.e. always compress 64kb in one go. If there's more
> compress that independently.
I agree that idea is worth considering. But I think we should decide
which way is better and then do just one or the other. I can't see
the point in adding wal_compress=full_pages now and then offering an
alternative wal_compress=big_records in 9.5.
I think it's also quite likely that there may be cases where
context-aware compression strategies can be employed. For example,
the prefix/suffix compression of updates that Amit did last cycle
exploit the likely commonality between the old and new tuple. We
might have cases like that where there are meaningful trade-offs to be
made between CPU and I/O, or other reasons to have user-exposed knobs.
I think we'll be much happier if those are completely separate GUCs,
so we can say things like compress_gin_wal=true and
compress_brin_effort=3.14 rather than trying to have a single
wal_compress GUC and assuming that we can shoehorn all future needs
into it.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2014-12-08 19:39:02 | Re: On partitioning |
Previous Message | Josh Berkus | 2014-12-08 19:30:30 | Re: On partitioning |