Re: Planner choice & tuning

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Planner choice & tuning
Date: 2004-06-22 23:38:16
Message-ID: 24392.1087947496@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Ed L." <pgsql(at)bluepolka(dot)net> writes:
> This seems to be saying the planner thinks its less expensive to do the
> sequential scan, but why?

Because it thinks it's less expensive ;-)

There are a couple issues here: one is overestimation of the number of
rows matching the query. That is a statistical issue and should be
fixable by increasing the statistics target for the column. With an
accurate rows estimate the seqscan cost estimate would not change but
the indexscan cost estimate would decrease approximately proportionally.

Allowing for the row estimation error, the indexscan cost estimate is
still about 4x what it ought to be, which means that you could bring the
estimated cost (for this query anyway) in line with reality by setting
random_page_cost = 1.

Note however that we are defining "reality" as "the results of this one
single experiment". You should realize in particular that the table is
probably fully cached in memory during your test, which is a scenario
in which random_page_cost actually is 1.0 (given reasonable assumptions
about the behavior of the kernel's cache, anyway). If you optimize for
this case then you are likely to be pessimizing the behavior for larger
tables that don't fit in memory.

My suggestion would be to lower random_page_cost to 3 or so, which would
be enough to tip the decision to indexscan for this case, though not to
make the estimated cost really truly correct. You should however run a
larger set of experiments before doing anything, and realize that any
random_page_cost setting is a compromise because the model doesn't take
all the variables into account.

You can find much more about this issue in the pgsql-performance archives.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Madison Kelly 2004-06-22 23:42:33 Re: More psql problems... >.<
Previous Message Scott Marlowe 2004-06-22 23:30:16 Re: More psql problems... >.<