Re: Strange issue with GiST index scan taking far too long

From: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-10 09:16:51
Message-ID: 484E4683.7030501@siriusit.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:

>> So you are saying it is de-toasted 32880 times, in this case? If not,
>> where are the repeated de-toastings happening?
>
> Inside the index support functions. I'm thinking we could fix this by
> forcibly detoasting values passed as index scan keys, but it's not quite
> clear where's the best place to do that.

Ouch. This is rapidly getting out of my sphere of knowledge, but I'd
guess you'd want to do this either just before you start the index scan,
or cache the results within the AM after the first deTOASTing.

In terms of PostGIS, we tend to do a lot of index queries against large
geometries so we see cases like this frequently - so optimising them
would be good.

I did think of another idea though: at the moment all members of the
GiST opclass for geometry objects are declared using the geometry type
(which contains the entire geometry), whereas individual entries are
stored within the index as box2d objects representing just their
bounding box.

Would it make sense to rework the GiST routines so that instead of
accepting geometry <op> geometry, they accept box2d <op> box2d? Then
surely if we add a CAST from geometry to box2d then the geometry would
get converted to its bounding box (which would not require deTOASTing)
before being used as an index scan key.

ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Gregory Stark 2008-06-10 09:43:51 Re: Core team statement on replication in PostgreSQL
Previous Message Zeugswetter Andreas OSB sIT 2008-06-10 09:01:02 Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics