From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, tomas(dot)vondra(at)2ndquadrant(dot)com, Simon Riggs <simon(at)2ndQuadrant(dot)com> |
Subject: | Re: On columnar storage (2) |
Date: | 2015-12-28 19:15:31 |
Message-ID: | 20151228191531.GO58441@alvherre.pgsql |
Lists: | pgsql-hackers |
Konstantin Knizhnik wrote:
Hi,
> Maybe you know that I have implemented IMCS (in-memory columnar store) as a
> PostgreSQL extension.
> It was not so successful, mostly because people prefer to use standard SQL
> rather than using some special functions for accessing columnar storage
> (CS). Now I am thinking about a second incarnation of IMCS, based on FDW and
> CSP (custom nodes). This is why I am very interested in your patch.
Great to hear.
> I have investigated the previous version of the patch and have some
> questions. I would be pleased if you could clarify them for me:
>
> 1. CS API.
> I agree with you that the FDW API does not seem to be enough to efficiently
> support work with CS.
> At least we need batch insert.
> But maybe it is better to extend the FDW API rather than creating a special
> API for CS?
The patch we have proposed thus far does not mess with executor
structure too much, so probably it would be possible to add some things
here and there to the FDW API and it might work. But in the long term I
think the columnar storage project is more ambitious; for instance, I'm
sure we will want to be able to vectorise certain operations, and the
FDW API will become a bottleneck, so to speak. I'm thinking of
vectorisation in two different ways: one is that some operations such as
computing aggregates over large data sets can work a lot faster if you
feed the values of one column for multiple tuples at a time in columnar
format; that way you can execute the operation directly in the CPU
(this requires specific support from the aggregate functions).
For this to work, the executor needs to be rejigged so that multiple
values (tuples) can be passed at once.
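
To illustrate the first kind of vectorisation, here is a minimal C sketch that contrasts a per-tuple aggregate transition with a per-batch one. The names `sum_transition` and `sum_transition_batch` are hypothetical; they are not part of the patch or of PostgreSQL's aggregate API, and a real implementation would work on Datums and aggregate state rather than bare `int64_t` values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row-at-a-time transition: called once per input tuple. */
static int64_t
sum_transition(int64_t state, int64_t value)
{
    return state + value;
}

/*
 * Hypothetical batch transition: called once per batch, with the values
 * of a single column for nvalues tuples laid out contiguously.  The loop
 * has no per-tuple call overhead and is easy for a compiler to vectorise.
 */
static int64_t
sum_transition_batch(int64_t state, const int64_t *values, size_t nvalues)
{
    for (size_t i = 0; i < nvalues; i++)
        state += values[i];
    return state;
}
```

Both functions compute the same sum; the difference is only in how many values cross the function boundary per call, which is the part that needs support from both the aggregate and the executor.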
The other aspect of vectorisation is that one input tuple might have
been split across several data origins, so that one half of the tuple is
in columnar format and the other half is in row format; that lets you do
very fast updates on the row-formatted part, while allowing fast reads
for the columnar format, for instance. (It's well known that
column-oriented storage does not go well with updates; some
implementations even disallow updates and deletes altogether.) Currently
within the executor
a tuple is a TupleTableSlot which contains one Datum array, which has
all the values coming out of the HeapTuple; but for split storage
tuples, we will need to have a TupleTableSlot that has multiple "Datum
arrays" (in a way --- because, actually, once we get to vectorise as in
the preceding paragraph, we no longer have a Datum array, but some more
complex representation).
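
As a rough illustration of the split-storage idea, the following C sketch shows a slot-like structure whose leading attributes come from a row-format Datum array and whose remaining attributes come from per-column batches. Everything here is an assumption made for the example: `SplitSlot`, `ColumnBatch` and `split_slot_getattr` are hypothetical names, `Datum` is a local stand-in typedef, and none of this reflects the actual TupleTableSlot definition or the posted patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;        /* local stand-in for PostgreSQL's Datum */

/* Hypothetical: the values of one column for nrows consecutive tuples. */
typedef struct ColumnBatch
{
    const Datum *values;
    size_t       nrows;
} ColumnBatch;

/* Hypothetical slot for a tuple split between row and columnar storage. */
typedef struct SplitSlot
{
    int                nrowattrs;   /* attributes stored in row format */
    const Datum       *rowvalues;   /* their values for the current tuple */
    int                ncolattrs;   /* attributes stored in columnar format */
    const ColumnBatch *columns;     /* one batch per columnar attribute */
    size_t             rowindex;    /* position of the current tuple in each batch */
} SplitSlot;

/* Fetch attribute attnum (0-based); returns false if it is out of range. */
static bool
split_slot_getattr(const SplitSlot *slot, int attnum, Datum *result)
{
    if (attnum < 0)
        return false;
    if (attnum < slot->nrowattrs)
    {
        *result = slot->rowvalues[attnum];
        return true;
    }
    attnum -= slot->nrowattrs;
    if (attnum >= slot->ncolattrs ||
        slot->rowindex >= slot->columns[attnum].nrows)
        return false;
    *result = slot->columns[attnum].values[slot->rowindex];
    return true;
}
```

The caller sees a single tuple, while updates would only need to touch `rowvalues` and scans over one column could read a `ColumnBatch` directly.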
I think that if we try to make the FDW API address all these concerns,
while at the same time *also* serving the needs of external data
sources, insanity will ensue.
> 2. Horizontal<->Vertical data mapping. As far as I understand this patch,
> the model of CS assumes that some table columns are stored in horizontal
> format (in heap), some - in vertical format (in CS). And there is
> one-to-one mapping between horizontal and vertical parts of row using CTID.
Yes, that part needs to go away. We will deal with this eventually; the
patch I posted was just some very basic infrastructure. In the future
we would like to be able to have real support for not having to
translate between column-oriented and row-oriented formats; at least for
some operations. (I expect that we will leave most code as it currently
is and require translation, while other parts that have been optimized
will be able to skip the translation step. As things mature we will make
more things understand the new format without translation.) This is also
dependent
on being able to vectorise the executor.
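
The translation step mentioned above can be pictured with a small C sketch: code that only understands row format gets one tuple assembled from the column arrays, while optimized code would read the columns directly and skip this copy. The function `columns_to_row` and the local `Datum` typedef are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;        /* local stand-in for PostgreSQL's Datum */

/*
 * Hypothetical translation helper: build the row-format Datum array for
 * the tuple at rowindex out of ncolumns separate column arrays.
 */
static void
columns_to_row(const Datum *const *columns, int ncolumns,
               size_t rowindex, Datum *row)
{
    for (int i = 0; i < ncolumns; i++)
        row[i] = columns[i][rowindex];
}
```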
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services