From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Arthur Ward" <award(at)dominionsciences(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Union+group by planner estimates way off? |
Date: | 2003-11-13 18:46:51 |
Message-ID: | 5806.1068749211@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
"Arthur Ward" <award(at)dominionsciences(dot)com> writes:
> EXPLAIN ANALYZE SELECT id FROM
> (SELECT id, commodity FROM commodities WHERE commodity IS NOT NULL
> UNION
> SELECT id, fak FROM commodities WHERE fak IS NOT NULL
> ) all_commodities GROUP BY id;
> QUERY
> PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
> HashAggregate (cost=15939.16..15939.16 rows=200 width=4) (actual
> time=3537.281..3680.418 rows=83306 loops=1)
> -> Subquery Scan all_commodities (cost=14002.00..15697.02 rows=96858
> width=4) (actual time=2268.052..3214.996 rows=95715 loops=1)
It's falling back to a default estimate because it doesn't know how to
find any statistics for the output of a sub-select. I have a TODO
somewhere about burrowing down into sub-selects to see if the output maps
directly to a column that we'd have stats for ... but it's not done yet.
In this particular case the inaccurate estimate doesn't matter too much,
I think, although it might be encouraging the system to select hash
aggregation since it thinks the hashtable will be pretty small. If the
estimate were getting used to plan higher-up plan steps then it could
be a bigger problem.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2003-11-13 18:56:26 | Re: strange estimate for number of rows |
Previous Message | Arthur Ward | 2003-11-13 18:28:15 | Union+group by planner estimates way off? |