From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jorge Montero" <jorge_montero(at)homedecorators(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hashaggregate estimates
Date: 2010-01-13 22:08:15
Message-ID: 5783.1263420495@sss.pgh.pa.us
Lists: pgsql-performance

"Jorge Montero" <jorge_montero(at)homedecorators(dot)com> writes:
> The killer seems to be the row aggregation. There are about 95K
> different values of sku_id in the sales table, and even the best
> seller items are a very small percentage of all rows, so expecting the
> aggregation to consolidate the rows 50:1 like it does in one of the
> explains above is a pipe dream. I've increased the statistics target
> on sku_id into the three digits, but the estimates are not any better.

Yeah, estimating the number of distinct values from a sample of the data
is a hard problem :-(.

> Is there any way I can coax Postgres into making a more realistic
> aggregation estimate?

There's no good way in 8.3. (In CVS HEAD there's a feature to manually
override the ndistinct estimate for a column.) In principle you could
manually update the pg_statistic.stadistinct value for the column, but
the trouble with that is the next ANALYZE will overwrite it.
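A minimal sketch of both approaches, assuming the sales table and sku_id
column from the report above and using ~95K as the known distinct count
(the ALTER TABLE ... SET (n_distinct = ...) spelling is roughly how the
CVS HEAD override is exposed; the pg_statistic update is the 8.3 stopgap,
and needs superuser):

    -- Raise the per-column statistics target and re-analyze
    -- (what the report above already tried):
    ALTER TABLE sales ALTER COLUMN sku_id SET STATISTICS 1000;
    ANALYZE sales;

    -- 8.3 stopgap: overwrite the stored ndistinct directly.
    -- The next ANALYZE on the table will clobber this value.
    UPDATE pg_statistic
       SET stadistinct = 95000
     WHERE starelid = 'sales'::regclass
       AND staattnum = (SELECT attnum FROM pg_attribute
                         WHERE attrelid = 'sales'::regclass
                           AND attname = 'sku_id');

    -- CVS HEAD: a per-column override that ANALYZE respects
    -- instead of overwriting.
    ALTER TABLE sales ALTER COLUMN sku_id SET (n_distinct = 95000);
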
regards, tom lane