From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Cc: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> |
Subject: | SP-GiST confusion: indexed column's type vs. index column type |
Date: | 2021-04-02 16:37:51 |
Message-ID: | 3728741.1617381471@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
While reviewing Pavel Borisov's patch to enable INCLUDE columns in
SP-GiST, I found some things that seem like pre-existing bugs.
These only accidentally fail to cause any problems in the existing
SP-GiST opclasses:
1. The attType passed to an opclass's config method is documented as
Oid attType; /* Data type to be indexed */
Now, I would read that as meaning the type of the underlying heap
column; the documentation and code about when a "compress" method
is required certainly seem to think so too. What is actually being
passed, though, is the data type of the index column, that is the
"opckeytype" of the index opclass. We've failed to notice this because
(1) for most of the core SP-GiST opclasses, the two types are the same,
and (2) none of the core opclasses bother to examine attType anyway.
2. When performing an index-only scan on an SP-GiST index, what we
pass back as the tuple descriptor of the output tuples is just the
index relation's own tupdesc, cf spgbeginscan:
/* Set up indexTupDesc and xs_hitupdesc in case it's an index-only scan */
so->indexTupDesc = scan->xs_hitupdesc = RelationGetDescr(rel);
Again, what this is going to report is the opckeytype, not the
reconstructed heap column datatype. That's just flat out wrong.
We've failed to notice because the only core opclass for which
those types are different is poly_ops, which does not support
reconstructing the polygons for index-only scan.
We need to do something about this because the INCLUDE patch needs
the relation descriptor of an SP-GiST index to match the reality
of what is stored in the leaf tuples. Right now, as far as I can tell,
there isn't really any necessary connection between the atttype
claimed by the relation descriptor and the leaf type that's physically
stored. They're accidentally the same in all existing opclasses,
but only accidentally.
I propose changing things so that
(A) attType really is the input (heap) data type.
(B) We enforce that leafType agrees with the opclass opckeytype,
ensuring the index tupdesc can be used for leaf tuples.
(C) The tupdesc passed back for an index-only scan reports the
input (heap) data type.
This might be too much change for the back branches. Given the
lack of complaints to date, I think we can just fix it in HEAD.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2021-04-02 16:46:28 | Re: SP-GiST confusion: indexed column's type vs. index column type |
Previous Message | Dmitry Dolgov | 2021-04-02 16:06:36 | Re: Asynchronous and "direct" IO support for PostgreSQL. |