From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Delaying the planning of unnamed statements until Bind |
Date: | 2004-05-30 20:27:19 |
Message-ID: | Pine.OSF.4.58.0405291601540.450948@kosh.hut.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, 22 May 2004, Greg Stark wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> > I think it's wishful thinking to assume that picking an indexscan is the
> > right thing when we don't know any better.
>
> If we don't know any better then any solution is going to be wishful thinking.
> It's kind of like a bridge game. If you don't know where the card is but you
> need it somewhere in order to win, then you have to assume it's there and hope
> for the best. If the index is wrong then the query was going to be slow either
> way. If the index was right and you chose not to use it you're risking making
> it slow unnecessarily and potentially when it was absoluetely required to be
> fast.
Could we try to plan for both cases? How about calculating the cutoff
point where the seqscan becomes faster than indexscan, creating a plan
with both paths, and picking which path to take at execute time?
More generally, keep *all* paths that make sense with *some* combination
of parameter values, and determine rules on which path to use with which
parameter values.
For example, if the query plan looks currently like this:
template1=# prepare b (int) AS SELECT * FROM foo WHERE bar > $1 ORDER BY bar*10;
PREPARE
template1=# explain EXECUTE b (2);
QUERY PLAN
-------------------------------------------------------------------------
Sort (cost=19.71..20.28 rows=230 width=4)
Sort Key: (bar * 10)
-> Index Scan using fooo on foo (cost=0.00..10.69 rows=230 width=4)
Index Cond: (bar > $1)
(4 rows)
It would become something like this:
template1=# prepare b (int) AS SELECT * FROM foo WHERE bar > $1 ORDER BY bar*10;
PREPARE
template1=# explain EXECUTE b (2);
QUERY PLAN
-------------------------------------------------------------------------
Sort (cost=19.71..20.28 rows=230 width=4)
Sort Key: (bar * 10)
if $1 > 400 then
-> Index Scan using fooo on foo (cost=0.00..10.69 rows=230 width=4)
Index Cond: (bar > $1)
else
-> Seq Scan on foo (cost=0.00..14.17 rows=629 width=4)
Filter: (bar > 100)
This means that execute stage would look at $1, and choose the seq scan if
it's > 400, otherwise use the seq scan.
I haven't looked at the planner code, I don't know how hard it would be to
implement, but I think it's something to consider.
Until we figure how to do the above, I think the plan-on-execute mode
would be very handy. However, it should not be on by default, IMHO.
I'm worried about plan stability if we enable it by default. It could
lead to nasty, hard to reproduce performance problems. Think about this
scenario:
A long-running server application prepares the query "SELECT * FROM foo
WHERE timestamp > $1". 99% of the transactions that use the prepared
query are online transactions that need to be very quick. They use
parameter values very close to now(), and should do an indexscan. The
rest are reports, running the same query with a parameter value of now() -
1 month. The reports should optimally use seqscan, but the slowness
of indexscan is acceptable too.
The application goes down every night for maintenance purposes, and is
restarted in the morning.
If the first transaction in the morning happens to be a report, all the
following online transactions will use a seqscan, and become veeery
slow. The operator gets angry phone calls, and reboots the system.
Everything starts to work ok.
Also keep in mind that at least some application servers have a
client-side prepared statement cache, so that even if the application
used a non-prepared statement for the reports, the middleware prepares it
anyway.
- Heikki
From | Date | Subject | |
---|---|---|---|
Next Message | Darko Prenosil | 2004-05-30 20:44:14 | Re: yet another contrib module |
Previous Message | pgsql | 2004-05-30 17:15:44 | Re: Converting postgresql.conf parameters to kilobytes |