<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Richard Huxton wrote:
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
<meta http-equiv="Content-Type" content="text/html; ">
<meta name="Generator" content="MS Exchange Server version 6.5.7226.0">
<title>Re: [PERFORM] Deceiding which index to use</title>
<!-- Converted from text/plain format -->
<p><font size="2">And does the planner know that?<br>
SELECT * FROM pg_stats WHERE tablename='subscriber' AND
attname='anumber';<br>
It's the n_distinct you're interested in, and perhaps most_common_freqs.</font><br>
</p>
</blockquote>
<small>n_distinct is -0.359322 and most_common_vals contains about 10
different anumbers (which are corretct), most_common_freqs are between
0.01 and 0.001. What does n_distinct exactly mean? Why is it negative?</small><br>
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
<p><font size="2">> Nothing, everything is the same - the problem
lies on the other table's index<br>
> usage, using this index is fine.<br>
<br>
The planner has to guess how many matches it will have for<br>
subscriber=5555555. Based on that choice, it will either:<br>
a. Do the join, then find the highest crd values (sort)<br>
b. Scan the crd values backwards and then join<br>
It's chosen (b) because it's estimating the numbers of matches<br>
incorrectly. I'm wondering whether the system can't see through your<br>
function-call (the cast to numeric) to determine how many matches it's<br>
going to get for any given value.<br>
</font></p>
</blockquote>
<small>It can see through the cast - I have just tried to create the
same database omitting the non-numeric anumbers and the results are the
same.</small><br>
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
<p><font size="2">If the system can't be persuaded into getting its
estimates more<br>
accurate, it might be worth trying an index on (subscriber_id,crd) and<br>
dropping the index on (crd) - if that's reasonable for your query
patterns.<br>
</font></p>
</blockquote>
<small>I'll try that one if the negative n_distinct value can be a
correct one :-)<br>
<br>
Zizi</small><br>
</body>
</html>