Quick Links

Re: Deceiding which index to use

From:	Richard Huxton <dev(at)archonet(dot)com>
To:	Mezei Zoltán <mezei(dot)zoltan(at)telefor(dot)hu>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Deceiding which index to use
Date:	2007-03-09 15:21:20
Message-ID:	45F17B70.60001@archonet.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Mezei Zoltán wrote:
> Richard Huxton wrote:
>>
>> And does the planner know that?
>> SELECT * FROM pg_stats WHERE tablename='subscriber' AND attname='anumber';
>> It's the n_distinct you're interested in, and perhaps most_common_freqs.
>>
> n_distinct is -0.359322 and most_common_vals contains about 10 different
> anumbers (which are corretct), most_common_freqs are between 0.01 and 0.001.
> What does n_distinct exactly mean? Why is it negative?

It's saying that it's a ratio, so if you doubled the number of
subscribers it would expect that the number of unique anumber's would
double too. So you've got about 36% of the rows with unique values -
pretty much what you said earlier. That's not bad, since the planner
only uses an estimate.

OK - so the next place to look is the distribution of values for
subscriber_id on the output_message_log. Does that have some subscribers
with many rows and lots with hardly any? If so, you might need to
increase the stats on that column:

ALTER TABLE output_message_log ALTER COLUMN subscriber_id SET STATISTICS
<num>;
ANALYSE output_message_log (subscriber_id);

The <num> defaults to 10, but can be set as high as 1000. You want to
try and capture the "big" subscribers.

--
Richard Huxton
Archonet Ltd

In response to

Re: Deceiding which index to use at 2007-03-09 15:13:47 from Mezei Zoltán

Responses

Re: Deceiding which index to use at 2007-03-09 15:44:18 from Mezei Zoltán

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Mezei Zoltán	2007-03-09 15:44:18	Re: Deceiding which index to use
Previous Message	Mezei Zoltán	2007-03-09 15:13:47	Re: Deceiding which index to use