Re: Weird indices

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Martijn van Oosterhout <kleptog(at)cupid(dot)suninternet(dot)com>
Cc: Joseph Shraibman <jks(at)selectacast(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: Weird indices
Date: 2001-02-21 17:38:09
Message-ID: Pine.BSF.4.21.0102210915570.8237-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> Well, who said about keeping track of changes? If the table is large chances
> are that this value would change very quickly. If the table is small it
> doesn't matter. It just seems to me to be the best way make clustering work
> better.

Yes, it probably is. I have some concerns about when it's wrong, but if
you're doing that many changes you probably need to run vacuum analyze
again anyway.

> > As far as I know the only way to change the fraction is through
> > recompiling but Tom would probably know better about that, unfortunately
> > that's a really big stick to hit the problem with.
>
> I realize that keeping better statistics is the best solution. However, not
> all types data can have a standard deviation since you need some form of
> order and that is not obvious in many cases...

True, but the same thing is pretty much true for a btree index. Maybe
in those cases, you just want to keep those kind of statistics on the
frequencies themselves. Since you can't really determine if something
is more likely to be high by its value (unless it's the most common),
you can try to keep info about where the most common frequency is and how
dispersed the frequencies are.

The big stick wasn't against the doing of it, just that there might exist
some tables where the current estimate is closer and you can't easily
change that per-table, except...
One thing that might be interesting is to see what it does if you tried
changing stacommonfraq in pg_statistic for that relation after a vacuum
analyze. That should change how many rows it thinks the most common value
has. I'm not sure of any side effects, but it seems to immediately change
my row estimates from explain. If you set it high enough that you still
get a sequence scan for the most common, but low enough that the others
given index scan, you might be okay.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2001-02-21 17:52:38 Re: Re: Deferred NOT NULL?
Previous Message Peter Eisentraut 2001-02-21 17:30:27 Re: ELOG_TIMESTAMPS feature