From: | Sam Mason <sam(at)samason(dot)me(dot)uk> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Table and Index compression |
Date: | 2009-08-07 12:38:36 |
Message-ID: | 20090807123835.GD5407@samason.me.uk |
Lists: | pgsql-hackers |
On Fri, Aug 07, 2009 at 12:59:57PM +0100, Greg Stark wrote:
> On Fri, Aug 7, 2009 at 12:48 PM, Sam Mason<sam(at)samason(dot)me(dot)uk> wrote:
> >> Well most users want compression for the space savings. So running out
> >> of space sooner than without compression when most of the space is
> >> actually unused would disappoint them.
> >
> > Note that, as far as I can tell, for a filesystem you only need to keep
> > enough reserved for the amount of uncompressed dirty buffers you have in
> > memory. As space runs out in the filesystem all that happens is that
> > the amount of (uncompressed?) dirty buffers you can safely have around
> > decreases.
>
> And when it drops to zero?
That was why I said you need to have one page left "to handle the base
case". I was treating the inductive case as the interesting common case
and considered the base case of lesser interest.
> > In PG's case, it would seem possible to do the compression and then
> > check to see if the resulting size is greater than 4kB. If it is you
> > write into the 4kB page size and write uncompressed data. Upon reading
> > you do the inverse, if it's 4kB then no need to decompress. I believe
> > TOAST does this already.
>
> It does, as does gzip and afaik every compression system.
It's still a case that needs to be handled explicitly by the code. Just
for reference, gzip does not appear to do this when I test it:
  echo -n 'a' | gzip > tmp.gz
  gzip -l --verbose tmp.gz
says the compression ratio is "-200%" (an empty string results in
an infinite increase in size yet gets displayed as "0%" for some
strange reason). It's only when you hit six 'a's that you start to get
positive ratios. Note that this is taking headers into account;
the compressed size is 23 bytes for both 'aaa' and 'aaaaaa' but the
uncompressed size obviously changes.
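To repeat the measurement over a range of lengths rather than one at a time, something like the following loop should do. It is only a sketch: gz_size is a helper name made up for this example, and the byte counts it prints are whole-file sizes (header and trailer included), so they may differ slightly between gzip versions.

```shell
#!/bin/sh
# gz_size (hypothetical helper): print the number of bytes gzip emits
# for the string given as $1, header and trailer included.
gz_size() {
    # $(( ... )) strips the leading blanks some wc implementations print
    echo $(( $(printf '%s' "$1" | gzip -c | wc -c) ))
}

s=''
n=0
while [ "$n" -le 8 ]; do
    echo "$n bytes in, $(gz_size "$s") bytes out"
    s="${s}a"        # grow the input by one 'a' per iteration
    n=$((n + 1))
done
```

The first column is the uncompressed length and the second is what ends up on disk, so the lengths at which the output is larger than the input can be read off directly.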
gzip does indeed have a "copy" method, but it doesn't seem to be
used.
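For what it's worth, handling the case explicitly could look roughly like the sketch below. Everything in it is an assumption made for illustration: store_block and load_block are invented names, the one-byte 'Z'/'R' tag is not how TOAST or any PG code marks its data, and the comparison is against the raw size rather than a fixed 4kB page.

```shell
#!/bin/sh
# store_block (hypothetical): write file $1 to $2, compressed only if
# that is actually smaller. A one-byte tag records which form was used:
# 'Z' = gzip'd, 'R' = raw. The tag is an assumption of this sketch.
store_block() {
    src=$1
    dst=$2
    tmp=$(mktemp) || return 1
    gzip -c "$src" > "$tmp"
    raw_size=$(( $(wc -c < "$src") ))
    packed_size=$(( $(wc -c < "$tmp") ))
    if [ "$packed_size" -lt "$raw_size" ]; then
        { printf 'Z'; cat "$tmp"; } > "$dst"
    else
        { printf 'R'; cat "$src"; } > "$dst"
    fi
    rm -f "$tmp"
}

# load_block (hypothetical): the inverse; decompress only if tagged 'Z'.
load_block() {
    tag=$(head -c 1 "$1")
    if [ "$tag" = Z ]; then
        tail -c +2 "$1" | gzip -dc
    else
        tail -c +2 "$1"
    fi
}
```

With this, a one-byte input is stored raw, while a long run of identical bytes is stored compressed, and load_block returns the original bytes in both cases.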
--
Sam http://samason.me.uk/