From: | Alexandra Wang <lewang(at)pivotal(dot)io> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Zedstore - compressed in-core columnar storage |
Date: | 2019-08-19 23:15:30 |
Message-ID: | CACiyaSr3EEMR=wjdhf9XZiBuOgB0bqdwjPyS5Yh63d-fpACBPQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Aug 18, 2019 at 12:35 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> . I was missing a way to check for compression ratio;
Here are the ways to check compression ratio for zedstore:
Table level:
select sum(uncompressedsz::numeric) / sum(totalsz) as compratio from
pg_zs_btree_pages(<tablename>);
Per column level:
select attno, count(*), sum(uncompressedsz::numeric) / sum(totalsz) as
compratio from pg_zs_btree_pages(<tablename>) group by attno order by attno;
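To illustrate what the two queries compute, here is a minimal Python sketch of the same arithmetic. The rows are made-up numbers shaped like the attno, totalsz and uncompressedsz columns used above, not output from a real table, and the function name is only for illustration. It assumes the ratio is meant as the sum of uncompressed sizes divided by the sum of on-page sizes, as in the queries.

```python
from collections import defaultdict

# Hypothetical rows shaped like (attno, totalsz, uncompressedsz).
# The numbers are invented for illustration only.
pages = [
    (1, 4000, 8000),
    (1, 2000, 10000),
    (2, 8000, 8000),
]


def compression_ratios(rows):
    """Return {attno: (page_count, sum(uncompressedsz) / sum(totalsz))}."""
    totals = defaultdict(lambda: [0, 0, 0])  # count, uncompressed, total
    for attno, totalsz, uncompressedsz in rows:
        entry = totals[attno]
        entry[0] += 1
        entry[1] += uncompressedsz
        entry[2] += totalsz
    return {
        attno: (count, uncompressed / total)
        for attno, (count, uncompressed, total) in sorted(totals.items())
    }


ratios = compression_ratios(pages)
for attno, (count, ratio) in ratios.items():
    print(attno, count, round(ratio, 2))

# Table-level ratio: the same division over all pages at once.
table_ratio = sum(p[2] for p in pages) / sum(p[1] for p in pages)
print("table", round(table_ratio, 2))
```

Dividing the sums weights each page by its size. For attno 1 in this sample the result is 3.0, whereas averaging the two per-page ratios (2.0 and 5.0) would give 3.5.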
> it looks like zedstore
> with lz4 gets ~4.6x for our largest customer's largest table. zfs using
> compress=gzip-1 gives 6x compression across all their partitioned
> tables,
> and I'm surprised it beats zedstore .
>
What kind of tables did you use? Is it possible to give us the schema
of the table? Did you perform 'INSERT INTO ... SELECT' or COPY?
Currently, COPY gives better compression ratios than single-row INSERTs
because it generates fewer metadata pages. The per-column query above
will show which columns have a lower compression ratio.
We plan to add other compression techniques, such as RLE and delta
encoding, alongside LZ4; these should give better compression ratios
for a column store.