Query plan choice issue

From: Yaroslav Tykhiy <yar(at)barnet(dot)com(dot)au>
To: pgsql-general(at)postgresql(dot)org
Subject: Query plan choice issue
Date: 2010-09-12 23:36:35
Message-ID: 2630508E-BB4C-4B1F-9FD8-40E87684A403@barnet.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all,

I'm seeing a funny behaviour in Postgresql 8.4.4. Namely, a query can
be executed using either of two different query plans, one taking a
few milliseconds and the other, tens of seconds. The work_mem setting
doesn't seem to affect it -- tried to increase or decrease it by 2 or
4 times, but it didn't seem to favour the fast plan choice. Honestly,
I have no idea what affects the plan choice, but I saw Postgresql
change it at random.

The query in question looks like this -- sorry, it's rather complex:

SELECT message_idnr
FROM dbmail_messages m
JOIN dbmail_physmessage p ON m.physmessage_id=p.id
JOIN dbmail_headervalue v ON v.physmessage_id=p.id
JOIN dbmail_headername n ON v.headername_id=n.id
WHERE mailbox_idnr = 12345 AND status IN (0,1) AND
headername ILIKE 'MESSAGE-ID' AND SUBSTRING(headervalue,0,255)
ILIKE '%<(dot)(dot)(dot)(at)mail(dot)gmail(dot)com>%'
ORDER BY message_idnr;

It comes from DBMail. That said, I don't think DBMail can be blamed
here because at least sometimes Postgresql is able to do the right
thing with respect to that query.

Here is the slow plan:
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1234610.11..1234610.11 rows=1 width=8) (actual
time=20933.166..20933.168 rows=1 loops=1)
Sort Key: m.message_idnr
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=0.00..1234610.10 rows=1 width=8) (actual
time=3327.658..20933.122 rows=1 loops=1)
-> Nested Loop (cost=0.00..1234601.59 rows=1 width=24)
(actual time=3327.599..20933.056 rows=1 loops=1)
-> Nested Loop (cost=0.00..1234549.74 rows=6 width=8)
(actual time=3327.503..20932.944 rows=1 loops=1)
Join Filter: (v.headername_id = n.id)
-> Seq Scan on dbmail_headername n
(cost=0.00..108.46 rows=1 width=8) (actual time=0.027..0.985 rows=1
loops=1)
Filter: ((headername)::text ~~* 'MESSAGE-
ID'::text)
-> Seq Scan on dbmail_headervalue v
(cost=0.00..1234407.96 rows=2666 width=16) (actual
time=3327.465..20931.942 rows=1 loops=1)
Filter: ("substring"(v.headervalue, 0,
255) ~~* '%<(dot)(dot)(dot)(at)mail(dot)gmail(dot)com>%'::text)
-> Index Scan using dbmail_messages_physmessage_idx on
dbmail_messages m (cost=0.00..8.63 rows=1 width=16) (actual
time=0.088..0.100 rows=1 loops=1)
Index Cond: (m.physmessage_id = v.physmessage_id)
Filter: ((m.status = ANY ('{0,1}'::integer[]))
AND (m.mailbox_idnr = 12345))
-> Index Scan using dbmail_physmessage_pkey on
dbmail_physmessage p (cost=0.00..8.50 rows=1 width=8) (actual
time=0.048..0.050 rows=1 loops=1)
Index Cond: (p.id = m.physmessage_id)
Total runtime: 20933.241 ms
(17 rows)

And here is the fast plan:
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1175284.93..1175284.94 rows=1 width=8) (actual
time=14.163..14.165 rows=1 loops=1)
Sort Key: m.message_idnr
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=0.00..1175284.92 rows=1 width=8) (actual
time=4.272..14.152 rows=1 loops=1)
-> Nested Loop (cost=0.00..1175276.41 rows=1 width=24)
(actual time=4.261..14.135 rows=1 loops=1)
-> Nested Loop (cost=0.00..1175268.46 rows=1 width=32)
(actual time=4.249..14.117 rows=1 loops=1)
-> Index Scan using dbmail_messages_mailbox_idx
on dbmail_messages m (cost=0.00..4153.35 rows=786 width=16) (actual
time=0.043..2.810 rows=358 loops=1)
Index Cond: (mailbox_idnr = 12345)
Filter: (status = ANY ('{0,1}'::integer[]))
-> Index Scan using dbmail_headervalue_2 on
dbmail_headervalue v (cost=0.00..1489.96 rows=1 width=16) (actual
time=0.028..0.029 rows=0 loops=358)
Index Cond: (v.physmessage_id =
m.physmessage_id)
Filter: ("substring"(v.headervalue, 0,
255) ~~* '%<(dot)(dot)(dot)(at)mail(dot)gmail(dot)com>%'::text)
-> Index Scan using dbmail_headername_pkey on
dbmail_headername n (cost=0.00..7.94 rows=1 width=8) (actual
time=0.007..0.008 rows=1 loops=1)
Index Cond: (n.id = v.headername_id)
Filter: ((n.headername)::text ~~* 'MESSAGE-
ID'::text)
-> Index Scan using dbmail_physmessage_pkey on
dbmail_physmessage p (cost=0.00..8.50 rows=1 width=8) (actual
time=0.006..0.007 rows=1 loops=1)
Index Cond: (p.id = m.physmessage_id)
Total runtime: 14.231 ms
(18 rows)

Do you think this query plan choice can be affected by postgresql.conf
settings? Or shall I file a bug report?

I'll be happy to provide the table and index structure details if
needed.

Thanks!

Yar

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Martin Gainty 2010-09-13 00:49:07 Re: Query plan choice issue
Previous Message Tom Lane 2010-09-12 22:44:26 Re: Huge amount of memory errors with libpq