Re: Zedstore - compressed in-core columnar storage

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-14 16:22:10
Message-ID: 20190414162210.6djspkmbspbd4els@development
Lists: pgsql-hackers

On Tue, Apr 09, 2019 at 02:29:09PM -0400, Robert Haas wrote:
>On Tue, Apr 9, 2019 at 11:51 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>> This is not surprising, considering that columnar store is precisely the
>> reason for starting the work on table AMs.
>>
>> We should certainly look into integrating some sort of columnar storage
>> in mainline. Not sure which of zedstore or VOPS is the best candidate,
>> or maybe we'll have some other proposal. My feeling is that having more
>> than one is not useful; if there are optimizations to one that can be
>> borrowed from the other, let's do that instead of duplicating effort.
>
>I think that conclusion may be premature. There seem to be a bunch of
>different ways of doing columnar storage, so I don't know how we can
>be sure that one size will fit all, or that the first thing we accept
>will be the best thing.
>
>Of course, we probably do not want to accept a ton of storage manager
>implementations in core. I think if people propose implementations
>that are poor quality, or missing important features, or don't have
>significantly different use cases from the ones we've already got,
>it's reasonable to reject those. But I wouldn't be prepared to say
>that if we have two significantly different column stores that are both
>awesome code with a complete feature set and significantly disjoint
>use cases, we should reject the second one just because it is also a
>column store. I think that won't get out of control because few
>people will be able to produce really high-quality implementations.
>
>This stuff is hard, which I think is also why we only have 6.5 index
>AMs in core after many, many years. And our standards have gone up
>over the years - not all of those would pass muster if they were
>proposed today.
>

It's not clear to me whether you're arguing for not having any such
implementation in core, or for having multiple ones. I think we should aim
to have at least one in-core implementation, even if it's not the best
possible one for all sizes. It's not like our rowstore is the best
possible implementation for all cases either.

I think having a colstore in core is important not just for adoption,
but also for testing and development of the executor / planner bits.

If we have multiple candidates with sufficient code quality, then we may
consider including both. I don't think it's very likely to happen in the
same release, considering how much work it will require. And I have no
idea if zedstore or VOPS are / will be the only candidates - it's way
too early at this point.

FWIW I personally plan to focus primarily on the features that aim to
be included in core, and that applies to colstores too.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
