From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io> |
Cc: | PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Zedstore - compressed in-core columnar storage |
Date: | 2019-04-10 07:48:22 |
Message-ID: | b4c776ce-ca3e-c30c-83d0-5820e19862de@iki.fi |
Lists: | pgsql-hackers |
On 10/04/2019 10:38, Konstantin Knizhnik wrote:
> I am also a little bit confused about UNDO records and MVCC support in
> Zedstore. Actually, a columnar store is mostly needed for analytics on
> read-only or append-only data. One of the disadvantages of Postgres is
> the quite large per-record space overhead caused by MVCC.
> It may be critical if you want to store huge time series with a relatively
> small number of columns (like measurements from some sensor).
> It would be nice to have a storage format which reduces this overhead when
> it is not needed (data is not updated).
Sure. Definitely something we need to optimize.
> Right now, even without UNDO pages, the size of zedstore is larger than
> the size of the original Postgres table.
> It seems to be very strange.
If you have a table with a lot of columns, but each column is small,
e.g. lots of boolean columns, the item headers that zedstore currently
stores for each datum take up a lot of space. We'll need to squeeze
those harder to make this competitive. For example, instead of storing a
header for each datum, if a group of consecutive tuples has the same
visibility information, we could store the header just once, followed by
an array of the datums.
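
To give a rough feel for the saving, here is a minimal Python sketch of the idea, not zedstore's actual on-disk format. The 16-byte `HEADER_BYTES` value, the use of a bare integer as "visibility information", and all function names are assumptions made purely for illustration.

```python
from itertools import groupby

# Hypothetical per-item header size in bytes; zedstore's real header differs.
HEADER_BYTES = 16


def per_datum_size(items):
    """Bytes used when every datum carries its own header.

    items is a list of (visibility, datum_bytes) pairs.
    """
    return sum(HEADER_BYTES + len(datum) for _, datum in items)


def group_by_visibility(items):
    """Collapse runs of consecutive items that share the same visibility
    information into (visibility, [datum, ...]) groups."""
    return [
        (vis, [datum for _, datum in run])
        for vis, run in groupby(items, key=lambda item: item[0])
    ]


def grouped_size(groups):
    """Bytes used when each group stores one header plus an array of datums."""
    return sum(
        HEADER_BYTES + sum(len(datum) for datum in datums)
        for _, datums in groups
    )


# 1000 one-byte booleans from one transaction, then 500 from another.
items = [(100, b"\x01")] * 1000 + [(101, b"\x00")] * 500
groups = group_by_visibility(items)

print(per_datum_size(items))  # 25500
print(grouped_size(groups))   # 1532
```

With one-byte datums the header dominates the per-datum layout, which is why a table of many small columns suffers most. The sketch ignores alignment, compression, and any per-array length field.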
- Heikki