Re: GiST index question: performance

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Steve Midgley <public(at)misuse(dot)org>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: GiST index question: performance
Date: 2007-03-05 20:21:55
Message-ID: Pine.LNX.4.64.0703052318020.400@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Mon, 5 Mar 2007, Steve Midgley wrote:

> Hi,
>
> First off, can I say how much I love GiST? It's already solved a few problems
> for me that seemed impossible to solve in real-time queries. Thanks to
> everyone who works on that project!

Thanks, Steve !

>
> I'm developing a geographic index based on a set of zip code boundaries.
> Points of interest (POI) will fall within some boundaries and not others. I
> need to search to find which POI are within a specified boundary.

You POI is what we call ConeSearch query in astronomy.
Please, take a look on Q3C algorithm available from http://q3c.sf.net.
Some information
http://www.sai.msu.su/~megera/wiki/SkyPixelization

This is what we use in our Virtual Observatory project and we're able to
work with 10^9 objects on moderate hardware. It doesn't use GiST but
special pixelization scheme allow to use standard Btree.

>
> I think have two options (see below) and I'm wondering if anyone has an
> opinion or experience as to whether one or the other will have substantially
> different performance characteristics. I can obviously test when I get that
> far, but I'd prefer to try the anticipated faster route first, if anyone has
> existing experience they can share:
>
> 1) Index a series of circles of NN radius around each boundary marker
> (lat/long point). Run a search on POI for those that fall within any of the
> specified circles.
>
> 2) Index a set of polygons that mark the "minimum area" around the boundary
> markers in question. Run a search on POI that fall within this single
> polygon.
>
> The polygon will have more points, but there will be more circles to search -
> my understanding of GiST is limited so I'm not sure if there's a performance
> benefit to searching many circles or a few polygons.
>
> My tables are of this size:
>
> # of POI: 50,000
> # of zip blocks (with and without regions): 217,000
> # of zip blocks in a given city (and hence in a given polygon): ~5
>
> Any thoughts or ideas?
>
> Thank you,
>
> Steve
>
> p.s. I could use a GIS system alongside of Postgres but performance and
> efficiency are key to this system, and it seems to me that raw GiST indexed
> SQL queries are going to be fastest and create the lowest load on the server?
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Steve Midgley 2007-03-05 21:23:28 Re: GiST index question: performance
Previous Message Steve Midgley 2007-03-05 19:12:31 GiST index question: performance