Re: logical column ordering

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical column ordering
Date: 2015-04-14 18:38:41
Message-ID: 20150414183841.GW4369@alvh.no-ip.org
Lists: pgsql-hackers

I've been looking at this again. It has become apparent to me that what
we're doing in parse analysis is wrong, and the underlying node
representation is wrong too. Here's a different approach, which I hope
will bear better fruit. I'm still working on implementing the ideas
here (and figuring out what the fallout is).

Currently, the patch stores RangeTblEntry->eref->colnames in logical
order; and it also adds a "map" from logical colnums to "attnum" (called
"lognums"). Now this is problematic for two reasons:

1. the lognums map becomes part of the stored representation of a rule;
any time you modified the logical ordering of a table underlying some
view, the view's _RETURN rule would have to be modified as well. Not
good.

2. RTE->eref->colnames is in attlognum order and thus can only be sanely
interpreted if RTE->lognums is available, so not only would lognums have
to be modified, but colnames as well.

I think the solution to both these issues is to store colnames in attnum
order rather than logical order, and *not* output RTE->lognums as part of
_outRangeTblEntry. This means that every time we read the RTE for the
table we need to obtain lognums from its tupledesc. RTE->eref->colnames
can then be sorted appropriately at plan time.
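
To make the idea above concrete, here is a minimal standalone sketch, not
code from the patch. DemoAttr and both demo_* functions are hypothetical
stand-ins: it assumes attrs[] plays the role of the tupledesc (in attnum
order), that colnames stays in attnum order as it would in a stored rule,
and that lognums[i] holds the attnum of the column at logical position
i+1.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for one attribute of a tupledesc. */
typedef struct DemoAttr
{
    const char *attname;
    int         attnum;     /* 1-based, stable identity of the column */
    int         attlognum;  /* 1-based logical (display) position */
} DemoAttr;

/*
 * Rebuild the logical-position -> attnum map from the attributes.
 * Nothing here is stored anywhere; it is recomputed on every read.
 */
void
demo_build_lognums(const DemoAttr *attrs, int natts, int *lognums)
{
    for (int i = 0; i < natts; i++)
    {
        assert(attrs[i].attlognum >= 1 && attrs[i].attlognum <= natts);
        lognums[attrs[i].attlognum - 1] = attrs[i].attnum;
    }
}

/*
 * Return the name of the column at logical position logpos, given
 * colnames in attnum order and a freshly built lognums map.
 */
const char *
demo_colname_at_logical(const char *const *colnames, const int *lognums,
                        int logpos)
{
    return colnames[lognums[logpos - 1] - 1];
}
```

With attributes (a, attnum 1, attlognum 3), (b, 2, 1), (c, 3, 2), the
stored colnames stay {a, b, c} no matter how the logical order changes;
only the recomputed lognums map differs.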

At RTE creation time (addRangeTableEntry and siblings) we can obtain
lognums and physnums. Both these arrays are available for later
application in setrefs.c, avoiding the need for the obviously misplaced
relation_open() call we currently have there.

There is one gotcha, which is that expandTupleDesc (and, really,
everything from expandRTE downwards) will need to be split in two
somehow: one part needs to fill in the colnames array in attnum order,
and the other part needs to expand the attribute array into Var nodes in
logical order.
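
The split described above might look roughly like the following standalone
sketch. It is only an illustration under stated assumptions: ExpAttr,
ExpVar, exp_fill_colnames and exp_expand_vars are hypothetical names, not
the real expandTupleDesc code, and a "Var" is reduced to just the attnum
it references.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for one attribute of a tupledesc. */
typedef struct ExpAttr
{
    const char *attname;
    int         attnum;     /* 1-based, stable identity of the column */
    int         attlognum;  /* 1-based logical (display) position */
} ExpAttr;

/* Hypothetical, simplified stand-in for a Var node. */
typedef struct ExpVar
{
    int         varattno;   /* attnum of the referenced column */
} ExpVar;

/* Part one: fill the colnames array in attnum order. */
void
exp_fill_colnames(const ExpAttr *attrs, int natts, const char **colnames)
{
    for (int i = 0; i < natts; i++)
    {
        assert(attrs[i].attnum >= 1 && attrs[i].attnum <= natts);
        colnames[attrs[i].attnum - 1] = attrs[i].attname;
    }
}

/*
 * Part two: emit the Vars in logical order.  Each Var still refers to
 * its column by attnum; only the position in the output list changes.
 */
void
exp_expand_vars(const ExpAttr *attrs, int natts, ExpVar *vars)
{
    for (int i = 0; i < natts; i++)
    {
        assert(attrs[i].attlognum >= 1 && attrs[i].attlognum <= natts);
        vars[attrs[i].attlognum - 1].varattno = attrs[i].attnum;
    }
}
```

The point of keeping the two parts separate is that the first output is
independent of attlognum, while the second is the only place where the
logical order leaks into the result.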

(If you recall, we need attphysnums at setrefs.c time so that we can
fix-up any TupleDesc created from a targetlist so that it contains the
proper attphysnum values. The attphysnum values for each attribute do
not propagate properly there, and I believe this is the mechanism to do
so.)

As I said, I'm still writing the first pieces of this so I'm not sure
what other ramifications it will have. If there are any thoughts, I
would appreciate them. (Input on whether it is acceptable to omit
lognums/physnums from _outRangeTblEntry would be particularly useful.)

An alternative idea would be to add lognums and physnums to RelOptInfo
instead of RangeTblEntry (we would do so during get_relation_info). I'm
not sure how this works for setrefs.c though, if at all; the advantage
is that RelOptInfo is not part of stored rules so we don't have to worry
about not saving them there.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
