Quick Links

Re: Deceiding which index to use

From:	Mezei Zoltán <mezei(dot)zoltan(at)telefor(dot)hu>
To:	Richard Huxton <dev(at)archonet(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Deceiding which index to use
Date:	2007-03-09 15:13:47
Message-ID:	45F179AB.7040000@telefor.hu
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Richard Huxton wrote:
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
<meta http-equiv="Content-Type" content="text/html; ">
<meta name="Generator" content="MS Exchange Server version 6.5.7226.0">
<title>Re: [PERFORM] Deceiding which index to use</title>

And does the planner know that? 
SELECT * FROM pg_stats WHERE tablename='subscriber' AND
attname='anumber'; 
It's the n_distinct you're interested in, and perhaps most_common_freqs. 

</blockquote>
n_distinct is -0.359322 and most_common_vals contains about 10
different anumbers (which are corretct), most_common_freqs are between
0.01 and 0.001. What does n_distinct exactly mean? Why is it negative? 
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
> Nothing, everything is the same - the problem
lies on the other table's index 
> usage, using this index is fine. 
 
The planner has to guess how many matches it will have for 
subscriber=5555555. Based on that choice, it will either: 
a. Do the join, then find the highest crd values (sort) 
b. Scan the crd values backwards and then join 
It's chosen (b) because it's estimating the numbers of matches 
incorrectly. I'm wondering whether the system can't see through your 
function-call (the cast to numeric) to determine how many matches it's 
going to get for any given value. 

</blockquote>
It can see through the cast - I have just tried to create the
same database omitting the non-numeric anumbers and the results are the
same. 
<blockquote cite="mid:45F1775A(dot)8030701(at)archonet(dot)com" type="cite">
If the system can't be persuaded into getting its
estimates more 
accurate, it might be worth trying an index on (subscriber_id,crd) and 
dropping the index on (crd) - if that's reasonable for your query
patterns. 

</blockquote>
I'll try that one if the negative n_distinct value can be a
correct one :-) 
 
Zizi 
</body>
</html>

Attachment	Content-Type	Size
unknown_filename	text/html	2.3 KB

In response to

Re: Deceiding which index to use at 2007-03-09 15:03:54 from Richard Huxton

Responses

Re: Deceiding which index to use at 2007-03-09 15:21:20 from Richard Huxton

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Richard Huxton	2007-03-09 15:21:20	Re: Deceiding which index to use
Previous Message	Richard Huxton	2007-03-09 15:03:54	Re: Deceiding which index to use