Re: [HACKERS] compression in LO and other fields

From: wieck(at)debis(dot)com (Jan Wieck)
To: tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane)
Cc: wieck(at)debis(dot)com, zakkr(at)zf(dot)jcu(dot)cz, t-ishii(at)sra(dot)co(dot)jp, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] compression in LO and other fields
Date: 1999-11-12 14:50:02
Message-ID: m11mI1a-0003kLC@orion.SAPserv.Hamburg.dsh.de
Lists: pgsql-hackers

Tom Lane wrote:

> wieck(at)debis(dot)com (Jan Wieck) writes:
> > HTML input might be somewhat optimal for Adisak's storage
> > format, but taking into account that my source implementing
> > the type input and output functions is smaller than 600
> > lines, I think 11% difference to a gzip -9 is a good result
> > anyway.
>
> These strike me as very good results. I'm not at all sure that using
> gzip or bzip would give much better results in practice in Postgres,
> because those compressors are optimized for relatively large files,
> whereas a compressed-field datatype would likely be getting relatively
> small field values to work on. (So your test data set is probably a
> good one for our purposes --- do the numbers change if you exclude
> all the files over, say, 10K?)

Will give it a try.

> It occurred to me last night that applying compression to individual
> fields might not be the best approach. Certainly a "bytez" data type
> is the easiest thing to fit into the existing system, but it's leaving
> some space savings on the table. What about compressing the *whole*
> data contents of a tuple on-disk, as a single entity? That should save
> more space than field-by-field compression.

But it requires decompression of every tuple into palloc()'d
memory during heap access. AFAIK, the heap access routines
currently return a pointer to the tuple inside the shm
buffer. Don't know what its performance impact would be.
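
To make the trade-off concrete, here is a minimal sketch using Python's zlib module. It is an illustration only, not the compressor or the tuple layout discussed in this thread: the sample fields, the length-prefixed packing, and the helper names (per_field_size, pack_tuple, whole_tuple_blob, read_field) are all assumptions made up for the example. It compares the space used by field-by-field compression against compressing the packed row as one unit, and shows that reading any single field from the whole-row form first requires decompressing the entire row into newly allocated memory.

```python
import zlib

# Hypothetical row: four short, similar text fields (made-up sample data).
fields = [
    b"<html><body><p>compression in LO and other fields</p></body></html>",
    b"<html><body><p>compressing a whole tuple on disk</p></body></html>",
    b"<html><body><p>field-by-field compression</p></body></html>",
    b"<html><body><p>heap access and shared buffers</p></body></html>",
]


def per_field_size(fields):
    """Total bytes when every field is compressed on its own."""
    return sum(len(zlib.compress(f, 9)) for f in fields)


def pack_tuple(fields):
    """Assumed row layout: each field preceded by a 4-byte big-endian length."""
    return b"".join(len(f).to_bytes(4, "big") + f for f in fields)


def whole_tuple_blob(fields):
    """Compress the packed row as a single entity."""
    return zlib.compress(pack_tuple(fields), 9)


def read_field(blob, index):
    """Fetch one field from a whole-row blob.

    The entire row is decompressed into a fresh buffer before the
    requested field can be located, even if only one field is wanted.
    """
    raw = zlib.decompress(blob)
    pos = 0
    for _ in range(index):
        n = int.from_bytes(raw[pos:pos + 4], "big")
        pos += 4 + n
    n = int.from_bytes(raw[pos:pos + 4], "big")
    return raw[pos + 4:pos + 4 + n]


if __name__ == "__main__":
    blob = whole_tuple_blob(fields)
    print("field-by-field bytes:", per_field_size(fields))
    print("whole-tuple bytes:   ", len(blob))
    print("field 2:", read_field(blob, 2))
```

On small, similar fields like these the whole-row blob comes out smaller, because the compressor can reuse strings across fields and pays its per-stream overhead only once. The price is visible in read_field: every access materializes a decompressed copy of the row, which is the cost in question when the access path otherwise hands back a pointer into the shared buffer.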

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #
