Query palns and tug-of-war with enable_sort

From: Glyn Astill <glynastill(at)yahoo(dot)co(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Query palns and tug-of-war with enable_sort
Date: 2009-02-18 14:27:39
Message-ID: 839213.35559.qm@web23606.mail.ird.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Chaps,

We have a legacy application that used to have it's own sequential database backend, and to fetch data out of it's tables commands such as "find gt table by index" would be used.

What we have now is a driver in the middle that constructs sql to access the data on pg8.3, typically of the form "SELECT ... FROM ... ORDER BY ... LIMIT n" and since we always have indexes that match the ORDER BY it creates I set enable_sort to off because in some rare cases the planner would choose a slower plan.

Reply with suitable comment about my foot-gun now if you're so inclined. But seeing as the purpose of our postgres installation is to replace that legacy backend for this application, and seeing as all the other queries I put together outside of thae application still picked good plans, I really wasn't too worried about this. We've been building lots of queries for over 5 months now, and this is the first time I've seen a non-ideal plan.

Here's the query:

SELECT DISTINCT mult_ref
FROM creditINNER JOIN mult_ord ON mult_ord.transno = credit.transno
WHERE (credit.show = 450000 OR credit.show = 450001)
AND credit."date" >= '2009-02-16'
AND credit."date" <= '2009-02-16'
AND credit.cancelled = ' '
ORDER BY mult_ref

With enable_sort on this is the plan it chooses:

HashAggregate (cost=14.72..14.73 rows=1 width=9)
-> Nested Loop (cost=0.00..14.72 rows=1 width=9)
-> Index Scan using credit_index02 on credit (cost=0.00..7.04 rows=1 width=9)
Index Cond: ((date >= '2009-02-16'::date) AND (date <= '2009-02-16'::date))
Filter: (((cancelled)::text = ' '::text) AND ((show = 450000::numeric) OR (show = 450
001::numeric)))
-> Index Scan using mult_ord_index02 on mult_ord (cost=0.00..7.67 rows=1 width=17)
Index Cond: (mult_ord.transno = credit.transno)

That's what I want, good. Now with enable_sort off this is the plan it chooses:

Group (cost=0.00..11149194.48 rows=1 width=9)
-> Nested Loop (cost=0.00..11149194.47 rows=1 width=9)
-> Index Scan using mult_ord_index01 on mult_ord (cost=0.00..442888.78 rows=9307812 width=17)
-> Index Scan using credit_index02 on credit (cost=0.00..1.14 rows=1 width=9)
Index Cond: ((credit.date >= '2009-02-16'::date) AND (credit.date <= '2009-02-16'::date) AND (credit.transno = mult_ord.transno))
Filter: (((credit.cancelled)::text = ' '::text) AND ((credit.show = 450000::numeric) OR (credit.show = 450001::numeric)))

With enable_sort off if I get rid of the distinct and swap the order by for a group by it picks a good plan, however once I stick the order by in there to try and sort it we go back to the plan above. Now I know to a degree the planner is really just doing what I've told it to do, but is there anything else I can tweek to try and get a ballance?

I've upped the statistics target from it's default of 10 to 100, which I think is probably a good idea anyway but it doesn't affect this quey plan.

Any ideas?

My initial testing was done on 8.2 and this , are there any factors that might mean I'm better off with enable_sort on in 8.3?

Regards
Glyn

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Grzegorz Jaśkiewicz 2009-02-18 14:36:56 Re: leak in libpq, getpwuid
Previous Message Michael Akinde 2009-02-18 13:27:41 Large object loading stalls