GiST -- making my index faster makes is slower

From: David Blasby <dblasby(at)refractions(dot)net>
To: pgsql-hackers(at)postgresql(dot)org, PostGIS Development Discussion <postgis-devel(at)postgis(dot)refractions(dot)net>
Subject: GiST -- making my index faster makes is slower
Date: 2004-04-16 18:57:08
Message-ID: 40802C84.9070405@refractions.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I just tried an experiment that I thought would improve the performance
of the PostGIS spatial index.

The old way used a BOX (bounding box of 4 doubles) as the key in the index.

The new way uses a bounding box of 4 single-precision floats (its only
16 bytes long - a BOX is 32). I thought that it would significantly
reduce the size of the index (it did) and that the indexing would be
faster since there's less disk pages to fiddle through and these pages
would be more likely to fit in the disk cache.

It turns out this is not the case - its significantly slower. I think
it to do with more keys fitting in the leaf tuples.

As far as I can tell, the GiST index looks through the internal nodes,
then when it hits a leaf node, it search through each of the keys in the
leaf.

I save time searching through the internal nodes (because there's less
of them) - this is O(log n) saving.

I lose time searching through the leafs (because there's more keys in a
leaf) - this is O(n) cost.

For a query that does a nested loop of 10,000 index scans (on the same
table), the old method takes about 2 second, the new "faster" method
takes about 20 seconds.

I'm still trying to verify that this is the actual reason why its slower
- anyone have any other ideas? Anyone know a good way to profile this?

dave

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gaetano Mendola 2004-04-16 19:25:52 Re: 7.5 beta version]
Previous Message Josh Berkus 2004-04-16 17:36:23 Re: Socket communication for contrib