From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tom Raney <twraney(at)comcast(dot)net> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-patches(at)postgresql(dot)org |
Subject: | Re: Hash Index Build Patch |
Date: | 2007-09-26 20:06:28 |
Message-ID: | 27185.1190837188@sss.pgh.pa.us |
Lists: | pgsql-patches |
Tom Raney <twraney(at)comcast(dot)net> writes:
> Alvaro Herrera wrote:
>> Just wondering, wouldn't it be enough to obtain a tuple count estimate
>> by using reltuples / relpages * RelationGetNumberOfBlocks, like the
>> planner does?
> We thought of that and the verdict is still out whether it is more
> costly to scan the entire relation to get the accurate count or use the
> estimate and hope for the best with the possibility of splits occurring
> during the build. If we use the estimate and it is completely wrong
> (with the actual tuple count being much higher) the sort will provide no
> benefit and it will behave as did the original code.
I think this argument is *far* too weak to justify an extra pass over
the relation. The planner-style calculation is quite unlikely to give a
major underestimate of the rowcount. It might overestimate, e.g. if the
relation is bloated by dead tuples, but an error in that direction won't
kill you.
regards, tom lane
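
For reference, the planner-style estimate discussed above (reltuples / relpages * RelationGetNumberOfBlocks) can be sketched roughly as follows. This is a minimal standalone illustration, not PostgreSQL source: the function name `estimate_tuple_count` and its parameters are hypothetical, with `cur_blocks` standing in for the value RelationGetNumberOfBlocks would return, and the zero-page fallback is an assumption made only to keep the sketch well defined.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not PostgreSQL source): scale the tuple density
 * recorded in the catalog statistics (reltuples / relpages) by the
 * relation's current size in blocks.
 */
double
estimate_tuple_count(double reltuples, uint32_t relpages, uint32_t cur_blocks)
{
    double density;

    /*
     * Assumption for this sketch only: with no recorded pages there is no
     * density to scale, so report zero instead of dividing by zero.
     */
    if (relpages == 0)
        return 0.0;

    /* Average tuples per page as of the last statistics update. */
    density = reltuples / (double) relpages;

    /* Extrapolate to the number of blocks the relation occupies now. */
    return density * (double) cur_blocks;
}
```

Because the density is multiplied by the current block count, a relation that has grown since its statistics were gathered gets a proportionally larger estimate. Blocks filled with dead tuples are counted too, which is why the error tends toward overestimation rather than underestimation.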