From: | Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> |
---|---|
To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Re: [HACKERS] Custom compression methods |
Date: | 2020-08-31 05:14:25 |
Message-ID: | CAJ3gD9c370yZFgHRAv+Up987o=t94sr22R36T=BKX5=_fw=TzA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 13 Aug 2020 at 17:18, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> I have rebased the patch on the latest head and currently, broken into 3 parts.
>
> v1-0001: As suggested by Robert, it provides the syntax support for
> setting the compression method for a column while creating a table and
> adding columns. However, we don't support changing the compression
> method for the existing column. As part of this patch, there is only
> one built-in compression method that can be set (pglz). In this, we
> have one in-build am (pglz) and the compressed attributes will directly
> store the oid of the AM. In this patch, I have removed the
> pg_attr_compresion as we don't support changing the compression
> for the existing column so we don't need to preserve the old
> compressions.
> v1-0002: Add another built-in compression method (zlib)
> v1:0003: Remaining patch set (nothing is changed except rebase on the
> current head, stabilizing check-world and 0001 and 0002 are pulled
> out of this)
>
> Next, I will be working on separating out the remaining patches as per
> the suggestion by Robert.
Thanks for this new feature. Looks promising and very useful, with so
many good compression libraries already available.
I see that with the patch-set, I would be able to create an extension
that defines a PostgreSQL C handler function which assigns all the
required hook function implementations for compressing, decompressing
and validating, etc. In short, I would be able to use a completely
different compression algorithm to compress toast data if I write such
an extension. Correct me if I am wrong with my interpretation.
Just a quick superficial set of review comments ....
A minor re-base is required due to a conflict in a regression test
-------------
In heap_toast_insert_or_update() and in other places, the comments for
new parameter preserved_am_info are missing.
-------------
+toast_compress_datum(Datum value, Oid acoid)
{
struct varlena *tmp = NULL;
int32 valsize;
- CompressionAmOptions cmoptions;
+ CompressionAmOptions *cmoptions = NULL;
I think tmp and cmoptions need not be initialized to NULL
-------------
- TOAST_COMPRESS_SET_RAWSIZE(tmp, valsize);
- SET_VARSIZE_COMPRESSED(tmp, len + TOAST_COMPRESS_HDRSZ);
/* successful compression */
+ toast_set_compressed_datum_info(tmp, amoid, valsize);
return PointerGetDatum(tmp);
Any particular reason why is this code put in a new extern function ?
Is there a plan to re-use it ? Otherwise, it's not necessary to do
this.
------------
Also, not sure why "HTAB *amoptions_cache" and "MemoryContext
amoptions_cache_mcxt" aren't static declarations. They are being used
only in toast_internals.c
-----------
The tab-completion doesn't show COMPRESSION :
postgres=# create access method my_method TYPE
INDEX TABLE
postgres=# create access method my_method TYPE
Also, the below syntax also would better be tab-completed so as to
display all the installed compression methods, in line with how we
show all the storage methods like plain,extended,etc:
postgres=# ALTER TABLE lztab ALTER COLUMN t SET COMPRESSION
------------
I could see the differences in compression ratio, and the compression
and decompression speed when I use lz versus zib :
CREATE TABLE zlibtab(t TEXT COMPRESSION zlib WITH (level '4'));
create table lztab(t text);
ALTER TABLE lztab ALTER COLUMN t SET COMPRESSION pglz;
pgg:s2:pg$ time psql -c "\copy zlibtab from text.data"
COPY 13050
real 0m1.344s
user 0m0.031s
sys 0m0.026s
pgg:s2:pg$ time psql -c "\copy lztab from text.data"
COPY 13050
real 0m2.088s
user 0m0.008s
sys 0m0.050s
pgg:s2:pg$ time psql -c "select pg_table_size('zlibtab'::regclass),
pg_table_size('lztab'::regclass)"
pg_table_size | pg_table_size
---------------+---------------
1261568 | 1687552
pgg:s2:pg$ time psql -c "select NULL from zlibtab where t like '0000'"
> /dev/null
real 0m0.127s
user 0m0.000s
sys 0m0.002s
pgg:s2:pg$ time psql -c "select NULL from lztab where t like '0000'"
> /dev/null
real 0m0.050s
user 0m0.002s
sys 0m0.000s
--
Thanks,
-Amit Khandekar
Huawei Technologies
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2020-08-31 05:18:48 | Re: list of extended statistics on psql |
Previous Message | Rahila | 2020-08-31 05:06:13 | Re: More tests with USING INDEX replident and dropped indexes |