| From: | wieck(at)debis(dot)com (Jan Wieck) |
|---|---|
| To: | tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane) |
| Cc: | wieck(at)debis(dot)com, zakkr(at)zf(dot)jcu(dot)cz, t-ishii(at)sra(dot)co(dot)jp, pgsql-hackers(at)postgreSQL(dot)org |
| Subject: | Re: [HACKERS] compression in LO and other fields |
| Date: | 1999-11-12 14:50:02 |
| Message-ID: | m11mI1a-0003kLC@orion.SAPserv.Hamburg.dsh.de |
| Lists: | pgsql-hackers |
Tom Lane wrote:
> wieck(at)debis(dot)com (Jan Wieck) writes:
> > HTML input might be somewhat optimal for Adisak's storage
> > format, but taking into account that my source implementing
> > the type input and output functions is smaller than 600
> > lines, I think 11% difference to a gzip -9 is a good result
> > anyway.
>
> These strike me as very good results. I'm not at all sure that using
> gzip or bzip would give much better results in practice in Postgres,
> because those compressors are optimized for relatively large files,
> whereas a compressed-field datatype would likely be getting relatively
> small field values to work on. (So your test data set is probably a
> good one for our purposes --- do the numbers change if you exclude
> all the files over, say, 10K?)

Will give it a try.

> It occurred to me last night that applying compression to individual
> fields might not be the best approach. Certainly a "bytez" data type
> is the easiest thing to fit into the existing system, but it's leaving
> some space savings on the table. What about compressing the *whole*
> data contents of a tuple on-disk, as a single entity? That should save
> more space than field-by-field compression.

But it requires decompression of every tuple into palloc()'d
memory during heap access. AFAIK, the heap access routines
currently return a pointer to the tuple inside the shm
buffer. Don't know what its performance impact would be.

Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #
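
The trade-off Jan raises can be illustrated with a small sketch. This is not PostgreSQL code: `Page`, `fetch_in_place` and `fetch_decompressed` are hypothetical names, `malloc()` stands in for `palloc()`, and a toy run-length decoder stands in for a real compressor. It only contrasts returning a pointer into the buffer with allocating and filling a fresh copy on every access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one page in the shared buffer pool. */
typedef struct {
    unsigned char data[256];
} Page;

/* Uncompressed case: hand back a pointer into the page itself.
 * No allocation and no copy; the caller reads the tuple in place. */
const unsigned char *fetch_in_place(const Page *page, size_t offset)
{
    return page->data + offset;
}

/* Compressed case: the on-page bytes are (count, value) pairs (toy RLE,
 * an assumption for illustration only). Every access must allocate a
 * buffer and expand the tuple into it; the caller must free() the result. */
unsigned char *fetch_decompressed(const Page *page, size_t offset,
                                  size_t comp_len, size_t *out_len)
{
    const unsigned char *src = page->data + offset;
    size_t total = 0;
    size_t pos = 0;
    size_t i;
    unsigned char *out;

    /* First pass: compute the expanded size. */
    for (i = 0; i + 1 < comp_len; i += 2)
        total += src[i];

    out = malloc(total ? total : 1);
    if (out == NULL)
        return NULL;

    /* Second pass: expand each run into the private copy. */
    for (i = 0; i + 1 < comp_len; i += 2) {
        memset(out + pos, src[i + 1], src[i]);
        pos += src[i];
    }

    *out_len = total;
    return out;
}
```

The second path adds an allocation plus a pass over the data for each tuple fetched, which is the per-access cost the message is asking about.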