Re: POSTGRES DB 3 800 000 rows table, speed up?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Klein Balázs <Balazs(dot)Klein(at)axelero(dot)hu>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: POSTGRES DB 3 800 000 rows table, speed up?
Date: 2005-12-29 03:21:44
Message-ID: 3473.1135826504@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

=?iso-8859-1?Q?Klein_Bal=E1zs?= <Balazs(dot)Klein(at)axelero(dot)hu> writes:
> Could you explain this a little bit more?
> What are the conditions of this situation that makes b-tree ineffective?

Well, what he's trying to do is (abstracting a little)

WHERE low_bound_col <= probe_value AND probe_value <= high_bound_col

Given a btree index on low_bound_col, the best you could do with this is
scan all the index entries from the start of the index up to probe_value
... or about half the table, on average, which makes the index pretty
much useless. On the assumption that low_bound_col and high_bound_col
are usually close together, all of the useful hits will occur near the
end of that scan, or the beginning if you scan backwards --- but there's
no way to know when it's OK to stop looking.

Making a double-column index on (low_bound_col, high_bound_col) does
not improve the situation much, because the additional condition
high_bound_col >= probe_value doesn't let you avoid scanning small
values of low_bound_col. You might save some trips to the table proper
but you're still scanning half the index.

And of course indexing (high_bound_col, low_bound_col) isn't any better.

If you are willing to impose a hard-wired assumption about the possible
size of the low-bound-to-high-bound distance, you can extend the query
to something like

WHERE low_bound_col <= probe_value AND probe_value <= high_bound_col
AND low_bound_col >= (probe_value - max_distance)

which creates an efficiently indexable range limitation on
low_bound_col. Of course this is a very sucky kluge.

You can do a lot better with index types that are designed for
two-dimensional data instead of one-dimensional data. Btree is
a great data structure for one-dimensional searches, but that
doesn't make it the answer to everything.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Qingqing Zhou 2005-12-29 03:55:43 Re: I want to know how to improve the security of postgresql
Previous Message Jonel Rienton 2005-12-29 02:44:21 Re: [Bulk] Re: Final stored procedure question, for now anyway