From: Jeffery Collins <collins(at)onyx-technologies(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: lztext and compression ratios...
Date: 2000-07-05 16:59:12
Message-ID: 39636960.FEF2400C@onyx-technologies.com
Lists: pgsql-general pgsql-hackers pgsql-sql
I have been looking at using the lztext type and I have some
questions/observations. Most of my experience comes from attempting to
compress text records in a different database (CTREE), but I think the
experience is transferable.
My typical table consists of variable-length text records. The average
record length is around 1K bytes. I would like to compress my records
to save space and improve I/O performance (smaller records mean more
records fit into the file system cache, which means less I/O - or so the
theory goes). I am not too concerned about CPU, as we are using a 4-way
Sun Enterprise class server. So compression seems like a good idea to me.
My experience with attempting to compress such relatively small
(around 1K) text strings is that the compression ratio is not very
good. This is because the string is not long enough for the LZ
compression algorithm to establish really good compression patterns,
and because the de-compression table has to be built into each
record. What I have done in the past to get around these problems is
to "teach" the compression algorithm the patterns ahead of time and
store the de-compression patterns in an external table. Using
this technique, I have achieved *much* better compression ratios.
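
To make the idea concrete, here is a minimal sketch of the
external-dictionary approach using zlib's preset-dictionary support
(Python shown for brevity). This is not how lztext compresses
internally, and the dictionary and sample record below are made up
for illustration:

    import zlib

    # Hypothetical "trained" dictionary: frequent substrings collected
    # ahead of time from existing records.  zlib matches against the
    # tail of the dictionary, so put the most common strings last.
    ZDICT = (b"address=order_status=pending order_status=shipped "
             b"customer_id=")

    record = b"customer_id=10492 address=123 Main St order_status=shipped"

    # Plain compression: a short record gives LZ little chance to build
    # up useful back-references, so the ratio is poor.
    plain = zlib.compress(record)

    # Preset-dictionary compression: back-references may point into the
    # shared dictionary, so even short records compress well.
    c = zlib.compressobj(zdict=ZDICT)
    trained = c.compress(record) + c.flush()

    # Decompression supplies the same dictionary, which is stored once
    # in an external table rather than rebuilt inside every record.
    d = zlib.decompressobj(zdict=ZDICT)
    assert d.decompress(trained) == record

    print(len(record), len(plain), len(trained))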
So my questions/comments are:
- What are the typical compression ratios seen with lztext on
relatively small (i.e. around 1K) strings?
- Does anyone see a need/use for a generalized string compression
type that can be "trained" external to the individual records?
- Am I crazy in even attempting to compress strings of this relative
size? My largest table currently contains about 2 million entries of
roughly 1K strings, or about 2 GB of data. If I could compress this
to about 33% of its original size (not unreasonable with a trained LZ
compressor; rough arithmetic below), I would save a lot of disk space
(not really important) and a lot of file system cache space (very
important) and be able to fit the entire table into memory (very,
very important).
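
The back-of-envelope sizing, assuming the 3:1 trained ratio above
holds (an assumed target, not a measured result):

    # Sizing from the figures above; 0.33 is an assumed ratio.
    records = 2_000_000
    avg_record_bytes = 1024
    raw_bytes = records * avg_record_bytes        # ~2 GB raw
    compressed_bytes = raw_bytes * 0.33           # ~0.66 GB compressed
    print(raw_bytes / 2**30, compressed_bytes / 2**30)  # 1.91, 0.63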
Thank you,
Jeff