From: | wieck(at)debis(dot)com (Jan Wieck) |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane) |
Cc: | wieck(at)debis(dot)com, zakkr(at)zf(dot)jcu(dot)cz, t-ishii(at)sra(dot)co(dot)jp, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: [HACKERS] compression in LO and other fields |
Date: | 1999-11-12 14:50:02 |
Message-ID: | m11mI1a-0003kLC@orion.SAPserv.Hamburg.dsh.de |
Lists: | pgsql-hackers |
Tom Lane wrote:
> wieck(at)debis(dot)com (Jan Wieck) writes:
> > Html input might be somewhat optimal for Adisak's storage
> > format, but taking into account that my source implementing
> > the type input and output functions is smaller than 600
> > lines, I think 11% difference to a gzip -9 is a good result
> > anyway.
>
> These strike me as very good results. I'm not at all sure that using
> gzip or bzip would give much better results in practice in Postgres,
> because those compressors are optimized for relatively large files,
> whereas a compressed-field datatype would likely be getting relatively
> small field values to work on. (So your test data set is probably a
> good one for our purposes --- do the numbers change if you exclude
> all the files over, say, 10K?)
Will give it a try.
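A minimal sketch of the small-value effect raised above, assuming Python's zlib as a stand-in for gzip. The sample data and the `compression_ratio` helper are hypothetical and are not the test set or code discussed in this thread; the sketch only illustrates that a general-purpose compressor carries fixed overhead that weighs more heavily on short inputs.

```python
import zlib

def compression_ratio(data: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)

# Hypothetical sample: a short HTML-like row standing in for a field value.
snippet = b"<tr><td>name</td><td>value</td></tr>\n"

small_value = snippet        # a typical small field value
large_value = snippet * 500  # a file-sized input with the same content

print("small:", compression_ratio(small_value))
print("large:", compression_ratio(large_value))
```

On inputs this short, the stream header and checksum alone are a noticeable share of the output, which is one reason numbers measured on large files may not carry over to field-sized values.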
> It occurred to me last night that applying compression to individual
> fields might not be the best approach. Certainly a "bytez" data type
> is the easiest thing to fit into the existing system, but it's leaving
> some space savings on the table. What about compressing the *whole*
> data contents of a tuple on-disk, as a single entity? That should save
> more space than field-by-field compression.
But that requires decompressing every tuple into palloc()'d
memory during heap access. AFAIK, the heap access routines
currently return a pointer to the tuple inside the shared memory
buffer. I don't know what the performance impact would be.
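The trade-off between the two approaches can be sketched as follows. This is an illustration only, assuming Python's zlib and a made-up length-prefixed tuple layout; the field values and all function names are hypothetical and do not reflect PostgreSQL's on-disk format or heap access code.

```python
import zlib

# Hypothetical tuple: three similar text fields (made up for illustration).
fields = [
    b"<p>Compression in large objects and other fields, part one.</p>",
    b"<p>Compression in large objects and other fields, part two.</p>",
    b"<p>Compression in large objects and other fields, part three.</p>",
]

def compress_per_field(values):
    """Compress each field on its own, as a compressed-field datatype would."""
    return [zlib.compress(v) for v in values]

def compress_whole_tuple(values):
    """Frame fields with 4-byte length prefixes, then compress them as one unit."""
    payload = b"".join(len(v).to_bytes(4, "big") + v for v in values)
    return zlib.compress(payload)

def read_field_per_field(blobs, index):
    """Only the requested field is decompressed."""
    return zlib.decompress(blobs[index])

def read_field_whole_tuple(blob, index):
    """The whole tuple is inflated into a new buffer before any field is visible."""
    if index < 0:
        raise IndexError("field index out of range")
    payload = zlib.decompress(blob)
    pos = 0
    for i in range(index + 1):
        if pos + 4 > len(payload):
            raise IndexError("field index out of range")
        size = int.from_bytes(payload[pos:pos + 4], "big")
        pos += 4
        if i == index:
            return payload[pos:pos + size]
        pos += size

per_field_size = sum(len(b) for b in compress_per_field(fields))
whole_tuple_size = len(compress_whole_tuple(fields))
print("per-field bytes:", per_field_size)
print("whole-tuple bytes:", whole_tuple_size)
```

When fields share content, the whole-tuple form tends to come out smaller because later fields can reference earlier ones. The cost is that reading any single field means inflating the entire tuple into freshly allocated memory, rather than handing back a pointer into an existing buffer.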
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #