From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Klein Balázs <Balazs(dot)Klein(at)axelero(dot)hu> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: POSTGRES DB 3 800 000 rows table, speed up? |
Date: | 2005-12-29 03:21:44 |
Message-ID: | 3473.1135826504@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
=?iso-8859-1?Q?Klein_Bal=E1zs?= <Balazs(dot)Klein(at)axelero(dot)hu> writes:
> Could you explain this a little bit more?
> What are the conditions of this situation that makes b-tree ineffective?
Well, what he's trying to do is (abstracting a little)
WHERE low_bound_col <= probe_value AND probe_value <= high_bound_col
Given a btree index on low_bound_col, the best you could do with this is
scan all the index entries from the start of the index up to probe_value
... or about half the table, on average, which makes the index pretty
much useless. On the assumption that low_bound_col and high_bound_col
are usually close together, all of the useful hits will occur near the
end of that scan, or the beginning if you scan backwards --- but there's
no way to know when it's OK to stop looking.
Making a double-column index on (low_bound_col, high_bound_col) does
not improve the situation much, because the additional condition
high_bound_col >= probe_value doesn't let you avoid scanning small
values of low_bound_col. You might save some trips to the table proper
but you're still scanning half the index.
And of course indexing (high_bound_col, low_bound_col) isn't any better.
If you are willing to impose a hard-wired assumption about the possible
size of the low-bound-to-high-bound distance, you can extend the query
to something like
WHERE low_bound_col <= probe_value AND probe_value <= high_bound_col
AND low_bound_col >= (probe_value - max_distance)
which creates an efficiently indexable range limitation on
low_bound_col. Of course this is a very sucky kluge.
You can do a lot better with index types that are designed for
two-dimensional data instead of one-dimensional data. Btree is
a great data structure for one-dimensional searches, but that
doesn't make it the answer to everything.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Qingqing Zhou | 2005-12-29 03:55:43 | Re: I want to know how to improve the security of postgresql |
Previous Message | Jonel Rienton | 2005-12-29 02:44:21 | Re: [Bulk] Re: Final stored procedure question, for now anyway |