From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Incorrect behaviour when using a GiST index on points |
Date: | 2012-10-18 19:18:28 |
Message-ID: | 20121018191828.GB10844@tornado.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Oct 11, 2012 at 07:17:28AM -0400, Noah Misch wrote:
> On Tue, Oct 02, 2012 at 01:58:40PM -0400, Noah Misch wrote:
> > > > On Mon, Aug 27, 2012 at 7:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > >> There's also the big-picture question of whether we should just get rid
> > > >> of fuzzy comparisons in the geometric types instead of trying to hack
> > > >> indexes to work around them.
>
> > In any event, I think we should entertain a patch to make the GiST operator
> > class methods bug-compatible with corresponding operators. Even if we decide
> > to change operator behavior in HEAD, the back branches could use it.
>
> We have broad agreement that the specific implementation of fuzz in geometric
> comparison operators is shoddy, but nobody has voiced interest in designing a
> concrete improvement. I propose adding a TODO item "Remove or improve
> rounding in geometric comparison operators", endorsing Alexander's design, and
> reviewing his patch. Objections?
TODO added, and here's a review:
The patch adds no regression tests; it should add tests illustrating the
problems it fixes.
I audited the other indexable geometric operators for similar problems. This
passage in gist_point_consistent_internal(), which handles (point,point)
operators, caught my suspicion:
case RTSameStrategyNumber:
if (isLeaf)
{
result = FPeq(key->low.x, query->x)
&& FPeq(key->low.y, query->y);
}
else
{
result = (query->x <= key->high.x && query->x >= key->low.x &&
query->y <= key->high.y && query->y >= key->low.y);
}
break;
A leaf entry reachable from an internal entry may fall exactly on the
internal-entry bounding box. Since we would accept a fuzzy match at the leaf
level, I think we must also accept a fuzzy match at the internal level.
> *** a/src/backend/access/gist/gistproc.c
> --- b/src/backend/access/gist/gistproc.c
> ***************
> *** 1326,1331 **** gist_point_consistent(PG_FUNCTION_ARGS)
> --- 1327,1333 ----
> bool *recheck = (bool *) PG_GETARG_POINTER(4);
> bool result;
> StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
> + BOX *query, *key;
This function now has "query" variables within subsidiary blocks redundant
with and masking this one. Avoid doing that.
>
> switch (strategyGroup)
> {
> ***************
> *** 1337,1348 **** gist_point_consistent(PG_FUNCTION_ARGS)
> *recheck = false;
> break;
> case BoxStrategyNumberGroup:
> ! result = DatumGetBool(DirectFunctionCall5(
> ! gist_box_consistent,
> ! PointerGetDatum(entry),
> ! PG_GETARG_DATUM(1),
> ! Int16GetDatum(RTOverlapStrategyNumber),
> ! 0, PointerGetDatum(recheck)));
> break;
> case PolygonStrategyNumberGroup:
> {
> --- 1339,1356 ----
> *recheck = false;
> break;
> case BoxStrategyNumberGroup:
> ! /*
> ! * This code repeats logic of on_ob which uses simple comparison
> ! * rather than FP* functions.
> ! */
> ! query = PG_GETARG_BOX_P(1);
> ! key = DatumGetBoxP(entry->key);
> !
> ! *recheck = false;
> ! result = key->high.x >= query->low.x &&
> ! key->low.x <= query->high.x &&
> ! key->high.y >= query->low.y &&
> ! key->low.y <= query->high.y;
For leaf entries, this correctly degenerates to on_pb(). For internal
entries, it must, but does not, implement box_overlap(). (The fuzzy
box_overlap() would be fine.) I recommend making gist_point_consistent()'s
treatment of boxes resemble its treatment of circles and polygons; that eases
verifying their correctness. Call gist_box_consistent. Then, for leaf
entries, call box_contain_pt().
GiST "consistent" functions often validate the strategy number, but the
circle, polygon and box branches of gist_point_consistent silently assume
strategy % GeoStrategyNumberOffset == RTContainedByStrategyNumber. Should
they verify that assumption? I raise this as an incidental question; it is
not new with your patch.
Thanks,
nm
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2012-10-18 19:36:26 | Re: pg_stat_lwlocks view - lwlocks statistics, round 2 |
Previous Message | Christopher Browne | 2012-10-18 19:18:19 | Re: [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility |