From: | Kouber Saparev <kouber(at)saparev(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: LIMIT confuses the planner |
Date: | 2009-03-24 10:52:24 |
Message-ID: | 49C8BB68.9050004@saparev.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Now I am experiencing similar issue with another table, called
"message", for which there's a conditional index:
CREATE TABLE message (
message_sid SERIAL PRIMARY KEY,
from_profile_sid INT NOT NULL REFERENCES profile,
to_profile_sid INT NOT NULL REFERENCES profile,
sender_has_deleted BOOLEAN NOT NULL DEFAULT FALSE,
receiver_has_deleted BOOLEAN NOT NULL DEFAULT FALSE,
datetime TIMESTAMP NOT NULL DEFAULT NOW(),
body TEXT
);
CREATE INDEX message_from_profile_idx (from_profile_sid) WHERE NOT
sender_has_deleted;
So, again... adding a LIMIT makes the planner choose the "wrong" index.
db=# EXPLAIN ANALYZE SELECT
message_sid
FROM
message
WHERE
from_profile_sid = 1296 AND NOT sender_has_deleted
ORDER BY
message_sid DESC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=2307.70..2310.74 rows=1215 width=4) (actual
time=0.040..0.040 rows=2 loops=1)
Sort Key: message_sid
Sort Method: quicksort Memory: 25kB
-> Bitmap Heap Scan on message (cost=23.59..2245.45 rows=1215
width=4) (actual time=0.029..0.033 rows=2 loops=1)
Recheck Cond: ((from_profile_sid = 1296) AND (NOT
sender_has_deleted))
-> Bitmap Index Scan on message_from_profile_idx
(cost=0.00..23.28 rows=1215 width=0) (actual time=0.022..0.022 rows=2
loops=1)
Index Cond: (from_profile_sid = 1296)
Total runtime: 0.068 ms
(8 rows)
db=# EXPLAIN ANALYZE SELECT
message_sid
FROM
message
WHERE
from_profile_sid = 1296 AND NOT sender_has_deleted
ORDER BY
message_sid DESC LIMIT 20;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..1461.12 rows=20 width=4) (actual
time=0.817..932.398 rows=2 loops=1)
-> Index Scan Backward using message_pkey on message
(cost=0.00..88762.80 rows=1215 width=4) (actual time=0.816..932.395
rows=2 loops=1)
Filter: ((NOT sender_has_deleted) AND (from_profile_sid = 1296))
Total runtime: 932.432 ms
(4 rows)
I had already increased STATISTICS to 1000 for both from_profile_sid and
sender_has_deleted, and vacuum analyzed respectively (also did reindex),
but still statistical data is confusing me:
db=# SELECT n_distinct FROM pg_stats WHERE tablename='message' AND
attname='from_profile_sid';
n_distinct
------------
4068
(1 row)
db=# select avg(length) from (select from_profile_sid, count(*) as
length from message group by from_profile_sid) as freq;
avg
----------------------
206.5117822008693663
(1 row)
Any ideas/thoughts?
--
Kouber Saparev
http://kouber.saparev.com
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2009-03-24 11:07:34 | Re: Why creating GIN table index is so slow than inserting data into empty table with the same index? |
Previous Message | Tom Lane | 2009-03-24 03:35:12 | Re: Why creating GIN table index is so slow than inserting data into empty table with the same index? |