From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: strange explain analyze output |
Date: | 2008-08-28 18:08:18 |
Message-ID: | 25834.1219946898@sss.pgh.pa.us |
Lists: | pgsql-general |

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> On Thu, 2008-08-28 at 00:42 -0400, Tom Lane wrote:
>> The reason that these statements are not inconsistent is that the
>> Sort is the inner relation for a mergejoin. In the presence of
>> duplicate keys in the outer relation, a mergejoin will "rewind" and
>> rescan duplicate keys in the inner relation; ...
> Then wouldn't the planner have estimated more rows returned by the sort
> (including rescanned rows) than the HashAgg? It estimated exactly the
> same number as it estimated for the output of the HashAgg.

No, the planner's numbers are correct for its purposes --- what it wants
to know is the total footprint of each sub-relation, so as to estimate,
for instance, the amount of space that'd be needed to hash or sort it.
(It does in fact internally make an estimate of the number of repeatedly
fetched rows, but this isn't reflected in the EXPLAIN output.)

If anything, this discrepancy is an implementation flaw in EXPLAIN
ANALYZE: what it's measuring is not the number of tuples in the
sub-relation, but the number of times a tuple is returned. There are
some other cases, such as underneath a LIMIT, where what EXPLAIN ANALYZE
will report is quite at variance with what the planner reports.
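
To make the distinction concrete, here is a rough Python sketch of a merge join that rewinds its inner side on duplicate outer keys. It is a toy model, not PostgreSQL's executor code: `merge_join`, `returned`, `pos`, and `mark` are hypothetical names invented for this illustration, and both inputs are assumed to be already sorted lists of keys. The counter tracks how many times the inner side hands back a row, which is the kind of number EXPLAIN ANALYZE reports, as opposed to the number of rows the inner relation holds.

```python
def merge_join(outer, inner):
    """Toy merge join over two sorted key lists.

    Returns (joined_pairs, returned), where `returned` counts every time
    the inner side hands back a row, including rows re-read after a rewind.
    """
    joined = []
    returned = 0      # inner rows handed back, rescans included
    pos = 0           # current position in the inner input
    mark = 0          # start of the current run of equal inner keys
    prev_key = None

    for okey in outer:
        if okey == prev_key:
            # Duplicate outer key: rewind the inner side to the mark.
            pos = mark
        else:
            # New outer key: skip inner rows with smaller keys.
            while pos < len(inner) and inner[pos] < okey:
                pos += 1
                returned += 1
            mark = pos
        # Emit one joined pair per matching inner row.
        while pos < len(inner) and inner[pos] == okey:
            joined.append((okey, inner[pos]))
            pos += 1
            returned += 1
        prev_key = okey
    return joined, returned


outer = [1, 1, 1, 2]   # key 1 appears three times in the outer input
inner = [1, 2, 3]      # the inner relation holds only three rows
joined, returned = merge_join(outer, inner)
print(len(inner), returned)  # prints "3 4"
```

In this sketch the inner input has three rows, yet it returns a row four times because the row with key 1 is rescanned for each duplicate outer key. The first number corresponds to what the planner estimates, the second to what EXPLAIN ANALYZE counts.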

regards, tom lane