From: | Hannu Krosing <hannu(at)skype(dot)net> |
---|---|
To: | Luke Lonergan <llonergan(at)greenplum(dot)com> |
Cc: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Neil Conway <neilc(at)samurai(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: TOAST compression |
Date: | 2006-02-26 20:19:45 |
Message-ID: | 1140985185.3716.44.camel@localhost.localdomain |
Lists: | pgsql-hackers |
On Sun, 2006-02-26 at 09:31, Luke Lonergan wrote:
> Jim,
>
> On 2/26/06 8:00 AM, "Jim C. Nasby" <jnasby(at)pervasive(dot)com> wrote:
>
> > Any idea on how decompression time compares to IO bandwidth? In other
> > words, how long does it take to decompress 1MB vs read that 1MB vs read
> > whatever the uncompressed size is?
>
> On DBT-3 data, I've just run some tests meant to simulate the speed
> differences of compression versus native I/O. My thought is that an
> external use of gzip on a binary dump file should be close to the speed of
> LZW on toasted fields,
Your basic assumption is probably wrong :(
gzip at what level? The "compression level" setting of gzip has a big
effect on both compression speed and compression ratio. And I suspect that
even the fastest level (gzip -1) compresses more slowly, and to a smaller
size, than PostgreSQL's lzcompress.
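
To get a feel for how much the level setting matters, here is a minimal sketch in Python. It uses the standard zlib module, which implements the same deflate algorithm as gzip. The sample data is synthetic and only stands in for table rows (an assumption), and none of this measures PostgreSQL's lzcompress itself.

```python
import time
import zlib


def compare_levels(data, levels=(1, 6, 9)):
    """Return {level: (compressed_size, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Round-trip check so a broken run cannot pass silently.
        assert zlib.decompress(compressed) == data
        results[level] = (len(compressed), elapsed)
    return results


if __name__ == "__main__":
    # Synthetic, repetitive rows standing in for table data (an assumption).
    sample = b"".join(
        b"%d|Supplier#%09d|some address text|comment\n" % (i, i)
        for i in range(50000)
    )
    for level, (size, secs) in compare_levels(sample).items():
        ratio = size / len(sample)
        print(f"level {level}: {size} bytes ({ratio:.1%}), {secs:.3f}s")
```

Running it on a real dump of the table instead of the synthetic sample would give numbers closer to the case discussed here.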
> so I just dumped the "supplier" table (see below) of
> size 202MB in data pages to disk, then ran gzip/gunzip on the binary
> file. Second test - an 8k block dd from that same file, meant to simulate a
> seq scan (it's faster by 25% than doing it in PG though):
>
> ==================== gzip/gunzip =====================
> [mppdemo1(at)salerno0]$ ls -l supplier.bin
> -rw-r--r-- 1 mppdemo1 mppdemo1 177494266 Feb 26 09:17 supplier.bin
>
> [mppdemo1(at)salerno0]$ time gzip supplier.bin
>
> real 0m12.979s
> user 0m12.558s
> sys 0m0.400s
> [mppdemo1(at)salerno0]$ time gunzip supplier.bin
>
> real 0m2.286s
> user 0m1.713s
> sys 0m0.573s
These tests are also somewhat bogus. If you want them to be comparable
with the dd run below, you should have used 'time gzip -c supplier.bin >
/dev/null', so that gzip also discards its output instead of writing a
new file to disk.
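
The same like-for-like idea can be sketched in Python: read the file in 8k blocks once as a plain read, and once while compressing and throwing the output away, so that neither run pays for writing a result file. The function names and the level 6 default are my own choices for illustration, and zlib here produces a zlib stream rather than a gzip file, so treat this as an approximation of 'gzip -c ... > /dev/null', not a replacement for it.

```python
import os
import tempfile
import time
import zlib

BLOCK = 8192  # same block size as the dd run


def time_read(path):
    """Read the file in 8k blocks and discard the data."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(BLOCK)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def time_compress_discard(path, level=6):
    """Read the same blocks, compress them, and throw the output away."""
    total_in = total_out = 0
    comp = zlib.compressobj(level)
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(BLOCK)
            if not chunk:
                break
            total_in += len(chunk)
            # Only the output length is kept; the bytes are dropped.
            total_out += len(comp.compress(chunk))
    total_out += len(comp.flush())
    return total_in, total_out, time.perf_counter() - start


if __name__ == "__main__":
    # A small temporary file stands in for supplier.bin (an assumption).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"supplier row data, fairly repetitive\n" * 200000)
    try:
        size, read_secs = time_read(tmp.name)
        _, out_size, comp_secs = time_compress_discard(tmp.name)
        print(f"read:     {size} bytes in {read_secs:.3f}s")
        print(f"compress: {size} -> {out_size} bytes in {comp_secs:.3f}s")
    finally:
        os.unlink(tmp.name)
```

Both timings are still affected by the OS file cache, so the file should be either cold or warm for both runs.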
> [mppdemo1(at)salerno0]$ time dd if=supplier.bin of=/dev/null bs=8k
> 21666+1 records in
> 21666+1 records out
>
> real 0m0.138s
> user 0m0.003s
> sys 0m0.135s
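
For reference, the quoted timings can be turned into throughput figures with simple arithmetic on the 177494266-byte file. A small sketch (the helper name is hypothetical, and MB here means 10^6 bytes):

```python
def throughput_mb_s(size_bytes, seconds):
    """Throughput in MB/s, with 1 MB taken as 1,000,000 bytes."""
    return size_bytes / seconds / 1_000_000


SIZE = 177494266  # bytes, from the ls -l output quoted above

# Wall-clock ("real") times from the quoted runs.
timings = {"gzip": 12.979, "gunzip": 2.286, "dd bs=8k": 0.138}

if __name__ == "__main__":
    for name, secs in timings.items():
        print(f"{name}: {throughput_mb_s(SIZE, secs):.1f} MB/s")
```

That works out to roughly 14 MB/s for gzip, 78 MB/s for gunzip and about 1286 MB/s for dd, all measured against the uncompressed size. The dd figure is high enough that the file was probably being served from the OS cache rather than from disk, though that is my inference and not something the test output shows.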
----------------
Hannu