Re: Zedstore - compressed in-core columnar storage

From: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-15 16:15:51
Message-ID: CALfoeitV6Hj-_JHxQXoDERs=s0R=whAGYJz7Gv=g5t1z8_DKRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 13, 2019 at 4:22 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Thu, Apr 11, 2019 at 6:06 AM Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
> wrote:
> > Reading about it reminds me of this work -- TAG column storage(
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www09.sigmod.org_sigmod_record_issues_0703_03.article-2Dgraefe.pdf&d=DwIBaQ&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=gxIaqms7ncm0pvqXLI_xjkgwSStxAET2rnZQpzba2KM&m=H2hOVqCm9svWVOW1xh7FhoURKEP-WWpWso6lKD1fLoM&s=KNOse_VUg9-BW7SyDXt1vw92n6x_B92N9SJHZKrdoIo&e=
> ).
> > Isn't this storage system inspired from there, with TID as the TAG?
> >
> > It is not referenced here so made me wonder.
>
> I don't think they're particularly similar, because that paper
> describes an architecture based on using purely logical row
> identifiers, which is not what a TID is. TID is a hybrid
> physical/logical identifier, sometimes called a "physiological"
> identifier, which will have significant overhead.

Storage system wasn't inspired by that paper, but yes seems it also talks
about laying out column data in btrees, which is good to see. But yes as
pointed out by Peter, the main aspect the paper is focusing on to save
space for TAG, isn't something zedstore plan's to leverage, it being more
restrictive. As discussed below we can use other alternatives to save space.

> Ashwin said that
> ZedStore TIDs are logical identifiers, but I don't see how that's
> compatible with a hybrid row/column design (unless you map heap TID to
> logical row identifier using a separate B-Tree).
>

Would like to know more specifics on this Peter. We may be having different
context on hybrid row/column design. When we referenced design supports
hybrid row/column families, it meant not within same table. So, not inside
a table one can have some data in row and some in column nature. For a
table, the structure will be homogenous. But it can easily support storing
all the columns together, or subset of columns together or single column
all connected together by TID.

> The big idea with Graefe's TAG design is that there is practically no
> storage overhead for these logical identifiers, because each entry's
> identifier is calculated by adding its slot number to the page's
> tag/low key. The ZedStore design, in contrast, explicitly stores TID
> for every entry. ZedStore seems more flexible for that reason, but at
> the same time the per-datum overhead seems very high to me. Maybe
> prefix compression could help here, which a low key and high key can
> do rather well.
>

Yes, the plan to optimize out TID space per datum, either by prefix
compression or delta compression or some other trick.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-04-15 16:21:44 Re: Multivariate MCV lists -- pg_mcv_list_items() seems to be broken
Previous Message Tom Lane 2019-04-15 16:07:25 Re: plpgsql - execute - cannot use a reference to record field