From: | Tiago Antão <tra(at)fct(dot)unl(dot)pt> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jules Bean <jules(at)jellybean(dot)co(dot)uk>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Optimisation deficiency: currval('seq')-->seq scan, constant-->index scan |
Date: | 2000-08-23 15:03:42 |
Message-ID: | Pine.LNX.4.21.0008231543340.4273-100000@eros.si.fct.unl.pt |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi!
On Wed, 23 Aug 2000, Tom Lane wrote:
> Yes, we know about that one. We have stats about the most common value
> in a column, but no information about how the less-common values are
> distributed. We definitely need stats about several top values not just
> one, because this phenomenon of a badly skewed distribution is pretty
> common.
An end-biased histogram has stats on top values and also on the least
frequent values. So if a there is a selection on a value that is well
bellow average, the selectivity estimation will be more acurate. On some
research papers I've read, it's refered that this is a better approach
than equi-width histograms (which are said to be the "industry" standard).
I not sure whether to use a table or a array attribute on pg_stat for
the histogram, the problem is what could be expected from the size of the
attribute (being a text). I'm very affraid of the cost of going through
several tuples on a table (pg_histogram?) during the optimization phase.
One other idea would be to only have better statistics for special
attributes requested by the user... something like "analyze special
table(column)".
Best Regards,
Tiago
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2000-08-23 15:20:28 | Re: Problem with insert |
Previous Message | Tom Lane | 2000-08-23 14:46:28 | Re: analyze.c |