Re: Deceiding which index to use

From: Mezei Zoltán <mezei(dot)zoltan(at)telefor(dot)hu>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Deceiding which index to use
Date: 2007-03-09 15:13:47
Message-ID: 45F179AB.7040000@telefor.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Richard Huxton wrote:
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
<meta http-equiv="Content-Type" content="text/html; ">
<meta name="Generator" content="MS Exchange Server version 6.5.7226.0">
<title>Re: [PERFORM] Deceiding which index to use</title>
<!-- Converted from text/plain format -->
<p><font size="2">And does the planner know that?<br>
SELECT * FROM pg_stats WHERE tablename='subscriber' AND
attname='anumber';<br>
It's the n_distinct you're interested in, and perhaps most_common_freqs.</font><br>
</p>
</blockquote>
<small>n_distinct is -0.359322 and most_common_vals contains about 10
different anumbers (which are corretct), most_common_freqs are between
0.01 and 0.001. What does n_distinct exactly mean? Why is it negative?</small><br>
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
<p><font size="2">&gt; Nothing, everything is the same - the problem
lies on the other table's index<br>
&gt; usage, using this index is fine.<br>
<br>
The planner has to guess how many matches it will have for<br>
subscriber=5555555. Based on that choice, it will either:<br>
   a. Do the join, then find the highest crd values (sort)<br>
   b. Scan the crd values backwards and then join<br>
It's chosen (b) because it's estimating the numbers of matches<br>
incorrectly. I'm wondering whether the system can't see through your<br>
function-call (the cast to numeric) to determine how many matches it's<br>
going to get for any given value.<br>
</font></p>
</blockquote>
<small>It can see through the cast - I have just tried to create the
same database omitting the non-numeric anumbers and the results are the
same.</small><br>
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
<p><font size="2">If the system can't be persuaded into getting its
estimates more<br>
accurate, it might be worth trying an index on (subscriber_id,crd) and<br>
dropping the index on (crd) - if that's reasonable for your query
patterns.<br>
</font></p>
</blockquote>
<small>I'll try that one if the negative n_distinct value can be a
correct one :-)<br>
<br>
Zizi</small><br>
</body>
</html>

Attachment Content-Type Size
unknown_filename text/html 2.3 KB

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2007-03-09 15:21:20 Re: Deceiding which index to use
Previous Message Richard Huxton 2007-03-09 15:03:54 Re: Deceiding which index to use