From: | David Blasby <dblasby(at)refractions(dot)net> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org, PostGIS Development Discussion <postgis-devel(at)postgis(dot)refractions(dot)net> |
Subject: | GiST -- making my index faster makes is slower |
Date: | 2004-04-16 18:57:08 |
Message-ID: | 40802C84.9070405@refractions.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I just tried an experiment that I thought would improve the performance
of the PostGIS spatial index.
The old way used a BOX (bounding box of 4 doubles) as the key in the index.
The new way uses a bounding box of 4 single-precision floats (its only
16 bytes long - a BOX is 32). I thought that it would significantly
reduce the size of the index (it did) and that the indexing would be
faster since there's less disk pages to fiddle through and these pages
would be more likely to fit in the disk cache.
It turns out this is not the case - its significantly slower. I think
it to do with more keys fitting in the leaf tuples.
As far as I can tell, the GiST index looks through the internal nodes,
then when it hits a leaf node, it search through each of the keys in the
leaf.
I save time searching through the internal nodes (because there's less
of them) - this is O(log n) saving.
I lose time searching through the leafs (because there's more keys in a
leaf) - this is O(n) cost.
For a query that does a nested loop of 10,000 index scans (on the same
table), the old method takes about 2 second, the new "faster" method
takes about 20 seconds.
I'm still trying to verify that this is the actual reason why its slower
- anyone have any other ideas? Anyone know a good way to profile this?
dave
From | Date | Subject | |
---|---|---|---|
Next Message | Gaetano Mendola | 2004-04-16 19:25:52 | Re: 7.5 beta version] |
Previous Message | Josh Berkus | 2004-04-16 17:36:23 | Re: Socket communication for contrib |