LZ compressing data type

From: wieck(at)debis(dot)com (Jan Wieck)
To: pgsql-hackers(at)postgreSQL(dot)org (PostgreSQL HACKERS)
Subject: LZ compressing data type
Date: 1999-11-17 22:38:36
Message-ID: m11oDim-0003kGC@orion.SAPserv.Hamburg.dsh.de
Lists: pgsql-hackers

Hi,

I just committed some changes that require an initdb.

New are the simple LZ compressor that was discussed, placed
into /utils/adt/pg_lzcompress.c, and a new lztext data type
based on it. You'll find a fairly detailed description of the
compression algorithm in the comments at the top of
pg_lzcompress.c.

Not very surprisingly to me, it turns out that the compressor
does a very good job on rule action strings. I used the 48
rules that can be found in pg_rewrite after the regression
test. The original string sizes range from 820 to 4615 bytes
and the compression rates from 35% to 76%, with an average of
60%. The 4615 byte rule action has been compressed to an
octet_length of 1126.
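
As a quick check of the numbers above, here is a minimal Python
sketch of the arithmetic. It assumes that "compression rate"
means the percentage of the original size that was saved; the
function name is made up for illustration and is not part of
the committed code.

```python
def compression_rate(original_size: int, compressed_size: int) -> float:
    """Percentage of the original size saved by compression (assumed definition)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (1.0 - compressed_size / original_size)


# Largest rule action from the post: 4615 bytes stored in 1126 bytes.
largest_rule_rate = compression_rate(4615, 1126)
print(f"{largest_rule_rate:.1f}%")  # prints 75.6%
```

Under that definition the largest rule lands at the upper end
of the 35% to 76% range quoted above.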

For the lztext type, there are conversion functions to/from
text, and the length() and octet_length() functions are
available. length() returns the same value as length() on text
would, while octet_length() returns the compressed size
without VARHDRSZ.

The type does not support MULTIBYTE or CYR_ENCODE so far.
It shouldn't be too hard to add that, and afterwards we might
add another lzbpchar type too. The latter is really
interesting, because an empty char(200) (thus containing 200
spaces) could result in an octet_length of 12 instead of 204
- that's a compression rate of 94.1%! With the default
settings it actually wouldn't, because the compressor's
default is to start only if the input is at least 256 bytes,
but there is a mechanism so that an lzbpchar type could force
this behaviour.
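
To make the char(200) case concrete, here is a rough Python
sketch of the size check and the resulting arithmetic. It is
only a model of the behaviour described above: the names
DEFAULT_MIN_INPUT_SIZE and should_compress are hypothetical,
and the override parameter merely stands in for whatever
mechanism the real compressor offers.

```python
# Hypothetical model of the minimum-size check described in the post.
DEFAULT_MIN_INPUT_SIZE = 256  # default threshold in bytes, as stated in the post


def should_compress(input_size: int,
                    min_input_size: int = DEFAULT_MIN_INPUT_SIZE) -> bool:
    """Return True if an input of this size would be handed to the compressor."""
    return input_size >= min_input_size


# An empty char(200) holds 200 spaces, which is below the default threshold.
default_decision = should_compress(200)

# A type such as lzbpchar could lower the threshold to force compression.
forced_decision = should_compress(200, min_input_size=0)

# Sizes quoted in the post: octet_length of 12 instead of 204.
saved_percent = 100.0 * (1.0 - 12 / 204)

print(default_decision, forced_decision, f"{saved_percent:.1f}%")
```

With the default threshold the 200 byte value is left alone;
only the forced variant would reach the 94.1% figure.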

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #
