From: | Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl> |
---|---|
To: | kaifeng(dot)zhu <cafeeee(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Bad query performance with more conditions? |
Date: | 2010-06-24 09:30:00 |
Message-ID: | 71317720-55C7-4BB4-AA74-90EEB91086FF@solfertje.student.utwente.nl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 24 Jun 2010, at 11:13, kaifeng.zhu wrote:
> <resend to mailing list>
>
> On Thu, Jun 24, 2010 at 16:57, Thom Brown <thombrown(at)gmail(dot)com> wrote:
>> Sounds like the planner took a wrong turn in the 2nd case. Which
>> version of PostgreSQL are you running?
>
> PostgreSQL version 8.1.21 (With schemas)
>
>
> The explain commands show that:
For a next time, explain analyse would have been more useful.
>
> db1=> explain select * from emails where email_msg_id =
> '4dba381b-f55e-02d3-4b4a-95e2e98178e2' limit 1;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.00..4.00 rows=1 width=184)
> -> Index Scan using idx_emails_email_msg_id on emails
> (cost=0.00..370016.42 rows=92413 width=184)
I think your problem is here. I assume that msg-id's are fairly unique between messages, but the planner expects there are over 90,000 records matching this ID. Hence the planner thinks this index has a low selectivity.
> Index Cond: (email_msg_id =
> '4dba381b-f55e-02d3-4b4a-95e2e98178e2'::bpchar)
> (3 rows)
>
> db1=> explain select * from emails where email_msg_id =
> '4dba381b-f55e-02d3-4b4a-95e2e98178e2' and email_sender =
> 'sender(at)domain(dot)com' limit 1;
>
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.00..801.06 rows=1 width=184)
> -> Index Scan using idx_emails_email_sender on emails
> (cost=0.00..370089.46 rows=462 width=184)
For this index the planner only expects 462 rows, so it has a higher selectivity than the index on msg_id and therefore the planner prefers this index over the other one.
> Index Cond: (email_sender = 'sender(at)domain(dot)com'::text)
> Filter: (email_msg_id = '4dba381b-f55e-02d3-4b4a-95e2e98178e2'::bpchar)
Apparently there are quite a few records matching that sender, and the database has to check each of them whether they have the requested msg_id or not. Apparently the planner decided that a seqscan on the results would be the most efficient here - and it would indeed be fairly quick if there are only 462 rows with this sender.
In summary, I think your statistics are off. Do you vacuum frequently enough? Autovacuum helps here, but there have been large improvements to that in later versions.
Another approach would be an index on (email_sender, email_msg_id) - that would particularly help the second query and it shouldn't hurt queries on just email_sender much.
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
!DSPAM:737,4c2325a8286216564294622!
From | Date | Subject | |
---|---|---|---|
Next Message | tv | 2010-06-24 10:40:36 | Re: Bad query performance with more conditions? |
Previous Message | Michelle Konzack | 2010-06-24 09:27:24 | HA for PostgreSQL (Auth-Server) |