Re: Hash Index Build Patch

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Tom Raney <twraney(at)comcast(dot)net>
Cc:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject:	Re: Hash Index Build Patch
Date:	2007-09-26 20:06:28
Message-ID:	27185.1190837188@sss.pgh.pa.us
Lists:	pgsql-patches

Tom Raney <twraney(at)comcast(dot)net> writes:
> Alvaro Herrera wrote:
>> Just wondering, wouldn't it be enough to obtain a tuple count estimate
>> by using reltuples / relpages * RelationGetNumberOfBlocks, like the
>> planner does?

> We thought of that and the verdict is still out whether it is more
> costly to scan the entire relation to get the accurate count or use the
> estimate and hope for the best with the possibility of splits occurring
> during the build. If we use the estimate and it is completely wrong
> (with the actual tuple count being much higher) the sort will provide no
> benefit and it will behave as did the original code.

I think this argument is *far* too weak to justify an extra pass over
the relation. The planner-style calculation is quite unlikely to give a
major underestimate of the rowcount. It might overestimate, e.g., if the
relation is bloated by dead tuples, but an error in that direction won't
kill you.
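
For readers following the thread, the planner-style estimate Alvaro describes (reltuples / relpages * RelationGetNumberOfBlocks) can be sketched roughly as below. This is only an illustration, not code from the patch or from the PostgreSQL source: the function name `estimate_tuple_count`, its parameter types, and the choice to return -1 when relpages is zero are all assumptions made for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not PostgreSQL source): estimate the current row
 * count by scaling the tuple density recorded in the catalog statistics
 * (reltuples / relpages) by the relation's current size in blocks.
 *
 *   reltuples   row count recorded when statistics were last gathered
 *   relpages    page count recorded at the same time
 *   cur_blocks  current size in blocks, i.e. the value that
 *               RelationGetNumberOfBlocks would report
 *
 * Returns -1.0 when relpages is zero, meaning no density is available;
 * what to do in that case is left to the caller (an assumption of this
 * sketch).
 */
double
estimate_tuple_count(double reltuples, uint32_t relpages, uint32_t cur_blocks)
{
    if (relpages == 0)
        return -1.0;

    /* tuples per page at the last statistics update, times pages now */
    return (reltuples / (double) relpages) * (double) cur_blocks;
}
```

With reltuples = 1000, relpages = 10, and a relation that has since grown to 20 blocks, the sketch yields an estimate of 2000 rows. Because the density counts only the tuples recorded in the statistics while the block count also covers pages holding dead tuples, a bloated relation pushes the estimate up rather than down, which is the direction of error described above as harmless.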

regards, tom lane

In response to

Re: Hash Index Build Patch at 2007-09-26 19:43:03 from Tom Raney

Responses

Re: Hash Index Build Patch at 2007-09-27 08:09:17 from Simon Riggs
