Re: pluggable compression support

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndQuadrant(dot)com>
Subject:	Re: pluggable compression support
Date:	2013-06-15 11:56:36
Message-ID:	20130615115636.GA5875@alap2.anarazel.de
Lists:	pgsql-hackers

On 2013-06-15 13:25:49 +0200, Hannu Krosing wrote:
> On 06/15/2013 02:20 AM, Andres Freund wrote:
> > On 2013-06-14 17:12:01 -0700, Josh Berkus wrote:
> >> On 06/14/2013 04:01 PM, Andres Freund wrote:
> >>> It still contains a guc as described in the above message to control the
> >>> algorithm used for compressing new tuples but I think we should remove
> >>> that guc after testing.
> >> Did you add the storage attribute?
> > No. I think as long as we only have pglz and one new algorithm (even if
> > that is lz4 instead of the current snappy) we should just always use the
> > new algorithm. Unless I missed it nobody seemed to have voiced a
> > contrary position?
> > For testing/evaluation the guc seems to be sufficient.

> If not significantly harder than what you currently do, I'd prefer a
> true pluggable compression support which is

> a) dynamically configurable, say by using a GUC

> and

> b) self-describing, that is, the compressed data should have enough
> info to determine how to decompress it.

Could you perhaps actually read the description and the discussion
before making wild suggestions? Possibly even the patch.
Compressed toast datums now *do* have an identifier of the compression
algorithm used. That's how we can discern between pglz and whatever
algorithm we come up with.

But those identifiers should be *small* (since they are added to all
Datums) and they need to be stable, even across pg_upgrade. So I think
making this user configurable would be a grave error at this point.
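
To illustrate why the identifier has to be small and stable, here is a
minimal sketch in C. It is not the layout used by the patch: it assumes,
purely for illustration, that the algorithm id takes the top 2 bits of a
32-bit header word and the uncompressed size takes the remaining 30 bits.
All names (CompressId, pack_header, header_compress_id, header_rawsize)
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout, assumed for illustration only: the top 2 bits of a
 * 32-bit header word carry the algorithm id, the low 30 bits carry the
 * uncompressed size of the datum.
 */
#define COMPRESS_ID_BITS 2
#define RAWSIZE_BITS     (32 - COMPRESS_ID_BITS)
#define RAWSIZE_MASK     ((UINT32_C(1) << RAWSIZE_BITS) - 1)

/*
 * The ids end up on disk, so they must never be renumbered or reused;
 * otherwise data written before pg_upgrade could no longer be decoded.
 */
typedef enum
{
    COMPRESS_ID_PGLZ = 0,
    COMPRESS_ID_NEW = 1         /* the one new algorithm, e.g. snappy or lz4 */
} CompressId;

/* Pack an algorithm id and an uncompressed size into one header word. */
static inline uint32_t
pack_header(CompressId id, uint32_t rawsize)
{
    assert((uint32_t) id < (UINT32_C(1) << COMPRESS_ID_BITS));
    assert(rawsize <= RAWSIZE_MASK);
    return ((uint32_t) id << RAWSIZE_BITS) | rawsize;
}

/* Extract the algorithm id so the reader can pick the right decompressor. */
static inline CompressId
header_compress_id(uint32_t header)
{
    return (CompressId) (header >> RAWSIZE_BITS);
}

/* Extract the uncompressed size stored alongside the id. */
static inline uint32_t
header_rawsize(uint32_t header)
{
    return header & RAWSIZE_MASK;
}
```

With only 2 bits there is room for at most four ids in this sketch, which
is one reason a fixed, built-in numbering is easier to keep stable than
identifiers assigned through user configuration.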

> additionally it *could* have the property Simon proposed earlier
> of *uncompressed* pages having some predetermined size, so we
> could retain optimisations of substring() even on compressed TOAST
> values.

We are not changing the toast format here, so I don't think that's
applicable. That's a completely separate feature.

> the latter of course could also be achieved by adding offset
> column to toast tables as well.

> One more idea - if we are already changing toast table structure, we
> could introduce a notion of "compress block", which could run over
> several storage pages for much improved compression compared
> to compressing only a single page at a time.

We aren't changing the toast table structure. And we can't easily do so,
think of pg_upgrade.
Besides, toast always has compressed datums over several chunks. What
would be beneficial would be to compress in a way that you can compress
several datums together, but that's several orders of magnitude more complex and
is unrelated to this feature.
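
As a rough sketch of the point about chunks: a datum is compressed as a
whole and the compressed bytes are then split into fixed-size chunks, so
one compressed datum already spans several chunks. The chunk size of 1996
bytes below is an assumption for illustration (the real value depends on
the block size), and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed chunk size, for illustration only. */
#define CHUNK_SIZE 1996

/* Number of chunks needed to store one compressed datum. */
static inline size_t
chunks_needed(size_t compressed_len)
{
    return (compressed_len + CHUNK_SIZE - 1) / CHUNK_SIZE;
}

/* Length of chunk number chunk_no (0-based) of a compressed datum. */
static inline size_t
chunk_length(size_t compressed_len, size_t chunk_no)
{
    assert(chunk_no < chunks_needed(compressed_len));

    size_t offset = chunk_no * CHUNK_SIZE;
    size_t remaining = compressed_len - offset;

    /* Every chunk is full except possibly the last one. */
    return remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;
}
```

Each datum is still compressed on its own here; sharing compression state
across several datums would need a different scheme entirely.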

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: pluggable compression support at 2013-06-15 11:25:49 from Hannu Krosing

Responses

Re: pluggable compression support at 2013-06-15 12:11:54 from Hannu Krosing
