From: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com> |
---|---|
To: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zeugswetter Andreas DCP SD <ZeugswetterA(at)spardat(dot)at>, Greg Stark <gsstark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rod Taylor <pg(at)rbt(dot)ca>, "Bort, Paul" <pbort(at)tmwsystems(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Compression and on-disk sorting |
Date: | 2006-05-19 18:39:45 |
Message-ID: | 20060519183944.GB64371@pervasive.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, May 19, 2006 at 09:29:03AM +0200, Martijn van Oosterhout wrote:
> On Thu, May 18, 2006 at 10:02:44PM -0500, Jim C. Nasby wrote:
> > http://jim.nasby.net/misc/compress_sort.txt is preliminary results.
> > I've run into a slight problem in that even at a compression level of
> > -3, zlib is cutting the on-disk size of sorts by 25x. So my pgbench sort
> > test with scale=150 that was producing a 2G on-disk sort is now
> > producing a 80M sort, which obviously fits in memory. And cuts sort
> > times by more than half.
>
> I'm seeing 250,000 blocks being cut down to 9,500 blocks. That's almost
> unbeleiveable. What's in the table? It would seem to imply that our
> tuple format is far more compressable than we expected.
It's just SELECT count(*) FROM (SELECT * FROM accounts ORDER BY bid) a;
If the tape routines were actually storing visibility information, I'd
expect that to be pretty compressible in this case since all the tuples
were presumably created in a single transaction by pgbench.
If needs be, I could try the patch against http://stats.distributed.net,
assuming that it would apply to REL_8_1.
> Do you have any stats on CPU usage? Memory usage?
I've only been taking a look at vmstat from time-to-time, and I have yet
to see the machine get CPU-bound. Haven't really paid much attention to
memory. Is there anything in partucular you're looking for? I can log
vmstat for the next set of runs (with a scaling factor of 10000). I plan
on doing those runs tonight...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
From | Date | Subject | |
---|---|---|---|
Next Message | Jim C. Nasby | 2006-05-19 18:44:11 | Re: [OT] MySQL is bad, but THIS bad? |
Previous Message | Marc Munro | 2006-05-19 18:25:21 | Re: New feature proposal |