From: Sam Mason <sam(at)samason(dot)me(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Table and Index compression
Date: 2009-08-07 11:48:27
Message-ID: 20090807114827.GB5407@samason.me.uk
Lists: pgsql-hackers
On Fri, Aug 07, 2009 at 11:49:46AM +0100, Greg Stark wrote:
> On Fri, Aug 7, 2009 at 11:29 AM, Sam Mason<sam(at)samason(dot)me(dot)uk> wrote:
> > When you choose a compression algorithm you know how much space a worst
> > case compression will take (i.e. lzo takes up to 8% more for a 4kB block
> > size). This space should be reserved in case of situations like the
> > above and the filesystem shouldn't over-commit on this.
> >
> > Never had to think about this before though so I'm probably missing
> > something obvious.
>
> Well most users want compression for the space savings. So running out
> of space sooner than without compression when most of the space is
> actually unused would disappoint them.
Note that, as far as I can tell, for a filesystem you only need to keep
enough space reserved to cover the uncompressed dirty buffers you have
in memory. As free space runs out, all that happens is that the number
of (uncompressed?) dirty buffers you can safely have around decreases.
In practical terms this says that performance drops off when there is
less free space than the size of the filesystem's cache, and I think
you have to reserve exactly one block to handle the base case. But
there are so many problems associated with completely filling a
filesystem that I'm not sure this would really matter.
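To put very rough numbers on that (a sketch only; the function name and
the figures below are made up for illustration, not measured from
anything):

    /* Worst case: every dirty buffer turns out to be incompressible and
     * has to be written back at its full, uncompressed size, plus the
     * single extra block mentioned above for the base case. */
    #include <stddef.h>

    static size_t
    reserve_needed(size_t ndirty_buffers, size_t block_size)
    {
        return (ndirty_buffers + 1) * block_size;
    }

So with, say, 32k dirty 4kB buffers (128MB of dirty data) the filesystem
would need to keep roughly 128MB free, and that requirement shrinks as
the buffers get written out.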
> Also, I'm puzzled why the space increase would be proportional
> to the amount of data and be more than 300 bytes. There's no reason it
> wouldn't be a small fixed amount. The ideal is you set aside one bit
> -- if the bit is set the rest is compressed and has to save at least
> one bit. If the bit is not set then the rest is uncompressed. Maximum
> bloat is 1-bit. In real systems it's more likely to be a byte or a
> word.
It'll depend on the compression algorithm; LZ algorithms are dictionary
based, so you'd end up with a single dictionary entry for the
incompressible data and then a pointer to that entry.
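To get a feel for how small that per-block worst case is, here's a
trivial check using zlib's deflate (not lzo, so the exact figure will
differ from the 8% quoted above):

    #include <stdio.h>
    #include <zlib.h>

    int main(void)
    {
        uLong block = 4096;                 /* block size from the example above */
        uLong worst = compressBound(block); /* upper bound on deflate output */

        printf("%lu byte block -> at most %lu bytes compressed (+%lu)\n",
               block, worst, worst - block);
        return 0;
    }

Build with "cc bound.c -lz"; for a 4kB block the worst case is only a
few tens of bytes of overhead.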
In PG's case, it would seem possible to do the compression and then
check whether the resulting size is greater than 4kB. If it is, you
write the uncompressed data into the 4kB page instead. Upon reading you
do the inverse: if the stored size is exactly 4kB there's no need to
decompress. I believe TOAST does this already.
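For what it's worth, here's a rough sketch of that scheme. It uses zlib
rather than PG's own pglz/TOAST code, the struct and function names are
invented for the example, and it assumes the 4kB page size from above:

    #include <string.h>
    #include <zlib.h>

    #define PAGE_SIZE 4096

    typedef struct
    {
        uLong   len;            /* bytes stored in data[] */
        int     is_compressed;  /* 0 = raw page, 1 = deflate output */
        Bytef   data[PAGE_SIZE];
    } StoredPage;

    /* Compress a page; if the result is not smaller than the page
     * itself, store the original bytes and flag it as uncompressed.
     * Returns 0 on success, -1 on a zlib error. */
    static int
    store_page(const Bytef *page, StoredPage *out)
    {
        Bytef   buf[PAGE_SIZE];
        uLongf  clen = sizeof(buf);
        int     rc;

        rc = compress2(buf, &clen, page, PAGE_SIZE, Z_DEFAULT_COMPRESSION);
        if (rc == Z_OK && clen < PAGE_SIZE)
        {
            memcpy(out->data, buf, clen);        /* compression saved space */
            out->len = clen;
            out->is_compressed = 1;
        }
        else if (rc == Z_OK || rc == Z_BUF_ERROR)
        {
            memcpy(out->data, page, PAGE_SIZE);  /* no saving: store raw */
            out->len = PAGE_SIZE;
            out->is_compressed = 0;
        }
        else
            return -1;
        return 0;
    }

    /* The inverse: a full-size page was stored raw, so no decompression. */
    static int
    load_page(const StoredPage *in, Bytef *page)
    {
        uLongf  plen = PAGE_SIZE;

        if (!in->is_compressed)
        {
            memcpy(page, in->data, in->len);
            return 0;
        }
        return uncompress(page, &plen, in->data, in->len) == Z_OK ? 0 : -1;
    }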
--
Sam http://samason.me.uk/