Re: benchmarking the query planner

From:	Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To:	Gregory Stark <stark(at)enterprisedb(dot)com>
Cc:	Simon Riggs <simon(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "jd(at)commandprompt(dot)com" <jd(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: benchmarking the query planner
Date:	2008-12-12 19:34:42
Message-ID:	4942BCD2.9070306@cheapcomplexdevices.com
Lists:	pgsql-hackers

Gregory Stark wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> The amount of I/O could stay the same, just sample all rows on block. [....]
>
> It will also introduce strange biases. For instance in a clustered table it'll
> think there are a lot more duplicates than there really are because it'll see
> lots of similar values.

But for ndistinct, it seems it could only help things. If the
ndistinct guesser just picks
    max(the-current-one-row-per-block-guess,
        a-guess-based-on-all-the-rows-on-the-blocks)
it seems we'd be no worse off for clustered tables, and much
better off for randomly organized tables.
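
A rough sketch of that max() idea, in Python rather than anything resembling the real ANALYZE code. Everything here is an assumption for illustration: the function names are hypothetical, "one row per block" is simplified to "the first row of each sampled block", and the estimator is a Duj1-style formula, n*d / (n - f1 + f1*n/N), where n is the sample size, d the distinct values seen, f1 the values seen exactly once, and N the total row count.

```python
from collections import Counter


def duj1_estimate(sample, total_rows):
    """Duj1-style ndistinct estimate: n*d / (n - f1 + f1*n/N).

    Assumed estimator for illustration only.
    """
    n = len(sample)
    if n == 0:
        return 0.0
    counts = Counter(sample)
    d = len(counts)                                   # distinct values in the sample
    f1 = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
    estimate = n * d / (n - f1 + f1 * n / total_rows)
    # Clamp: at least what we saw, at most the number of rows in the table.
    return min(max(estimate, d), total_rows)


def combined_ndistinct(blocks, total_rows):
    """Take the larger of the two guesses described above.

    `blocks` is a list of sampled blocks, each a list of column values.
    """
    # Hypothetical stand-in for the current behaviour: one row per block.
    one_row_per_block = [block[0] for block in blocks if block]
    # The proposed alternative: every row on every sampled block.
    all_rows = [value for block in blocks for value in block]
    return max(
        duj1_estimate(one_row_per_block, total_rows),
        duj1_estimate(all_rows, total_rows),
    )
```

On a clustered sample such as [[1, 1, 1], [2, 2, 2], [3, 3, 3]] the all-rows guess sees many repeats and comes out low, so the max() falls back to the one-row-per-block guess, which is the "no worse off" case.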

In some ways I fear *not* sampling all rows on the block also
introduces strange biases by largely overlooking the fact that
the table is clustered.

In my tables clustered on zip-code we don't notice info like
"state='AZ' is present in well under 1% of blocks in the table",
while if we did scan all rows on the blocks it might guess this.
But I guess a histogram of blocks would be an additional stat
rather than an improved one.
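
To make the "histogram of blocks" idea concrete, here is a minimal Python sketch of the kind of per-value statistic meant above: for each value, the fraction of sampled blocks that contain it at least once. The function name and the input layout (a list of blocks, each a list of column values) are assumptions for illustration, not an existing PostgreSQL statistic.

```python
from collections import Counter


def block_fraction_by_value(blocks):
    """Return {value: fraction of sampled blocks containing that value}."""
    total_blocks = len(blocks)
    if total_blocks == 0:
        return {}
    blocks_containing = Counter()
    for block in blocks:
        # set() so a value repeated within one block counts that block once.
        blocks_containing.update(set(block))
    return {
        value: count / total_blocks
        for value, count in blocks_containing.items()
    }
```

Computing this requires reading every row on each sampled block, since a one-row-per-block sample cannot tell whether a value is absent from the rest of the block.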

In response to

Re: benchmarking the query planner at 2008-12-12 17:05:58 from Gregory Stark
