From: | Gabriele Bartolini <angusgb(at)tin(dot)it> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Normal case or bad query plan? |
Date: | 2004-10-11 21:05:59 |
Message-ID: | 6.1.2.0.2.20041011224414.01fadec0@box.tin.it |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi guys,
please consider this scenario. I have this table:
CREATE TABLE ip2location (
ip_address_from BIGINT NOT NULL,
ip_address_to BIGINT NOT NULL,
id_location BIGINT NOT NULL,
PRIMARY KEY (ip_address_from, ip_address_to)
);
I created a cluster on its primary key, by running:
CLUSTER ip2location_ip_address_from_key ON ip2location;
This allowed me to organise data in a more efficient way: the data that is
contained are ranges of IP addresses with empty intersections; for every IP
class there is a related location's ID. The total number of entries is 1392443.
For every IP address I have, an application retrieves the corresponding
location's id from the above table, by running a query like:
SELECT id_location FROM ip2location WHERE '11020000111' >= ip_address_from
AND '11020000111' <= ip_address_to;
For instance, by running the 'EXPLAIN ANALYSE' command, I get this "funny"
result:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Seq Scan on ip2location (cost=0.00..30490.65 rows=124781 width=8)
(actual time=5338.120..40237.283 rows=1 loops=1)
Filter: ((1040878301::bigint >= ip_address_from) AND
(1040878301::bigint <= ip_address_to))
Total runtime: 40237.424 ms
With other data, that returns an empty set, I get:
explain SELECT id_location FROM ip2location WHERE '11020000111' >=
ip_address_from AND '11020000111' <= ip_address_to;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Index Scan using ip2location_ip_address_from_key on
ip2location (cost=0.00..419.16 rows=140 width=8)
Index Cond: ((11020000111::bigint >= ip_address_from) AND
(11020000111::bigint <= ip_address_to))
I guess the planner chooses the best of the available options for the first
case, the sequential scan. This is not confirmed though by the fact that,
after I ran "SET enable_scan TO off", I got this:
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using ip2location_ip_address_from_key on
ip2location (cost=0.00..31505.73 rows=124781 width=8) (actual
time=2780.172..2780.185 rows=1 loops=1)
Index Cond: ((1040878301::bigint >= ip_address_from) AND
(1040878301::bigint <= ip_address_to))
Total runtime: 2780.359 ms
Is this a normal case or should I worry? What am I missing? Do you have any
suggestion or comment to do (that would be extremely appreciated)? Is the
CLUSTER I created worthwhile or not?
Thank you,
-Gabriele
--
Gabriele Bartolini: Web Programmer, ht://Dig & IWA/HWG Member, ht://Check
maintainer
Current Location: Prato, Toscana, Italia
angusgb(at)tin(dot)it | http://www.prato.linux.it/~gbartolini | ICQ#129221447
> "Leave every hope, ye who enter!", Dante Alighieri, Divine Comedy, The
Inferno
Attachment | Content-Type | Size |
---|---|---|
unknown_filename | text/plain | 169 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Kris Jurka | 2004-10-11 21:17:24 | Re: Normal case or bad query plan? |
Previous Message | Francisco Reyes | 2004-10-11 21:04:16 | Understanding explains |