Re: Performance problem with low correlation data

From:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To:	m_lists(at)yahoo(dot)it
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Performance problem with low correlation data
Date:	2009-07-09 17:36:45
Message-ID:	20090709173645.GK6414@alvh.no-ip.org
Lists:	pgsql-general

m_lists(at)yahoo(dot)it wrote:

> testinsert contains t values between '2009-08-01' and '2009-08-09', and ne_id from 1 to 20000. But only 800 out of 20000 ne_id have to be read; there's no need for a table scan!
> I guess this is a reflection of the poor "correlation" on ne_id; but, as I said, I don't really think ne_id is so badly correlated.
> In fact, running "select ne_id, t from testinsert limit 100000" shows that the data is laid out pretty much by "ne_id, t", grouped by day (that is, all rows for one ne_id for a day, then the next ne_id, and so on until the next day).
> How is the "correlation" calculated? Can someone explain to me why, after the procedure above, the correlation is so low?

Did you run ANALYZE after the procedure above?
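
On the "how is the correlation calculated" question: the planner statistic is documented as the statistical correlation between the physical row ordering and the logical ordering of the column values, ranging from -1 to +1, and it is only refreshed when ANALYZE runs. The following is a rough sketch of that idea in Python, not PostgreSQL's actual code. It assumes a scaled-down table (200 ne_id values, 4 rows per ne_id per day, 9 days, all hypothetical numbers), correlates the raw ne_id value with the row position instead of working on a sample of sort ranks, and the function names are made up for illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def physical_correlation(column):
    """Correlate each row's physical position with its column value."""
    positions = range(len(column))
    return pearson(positions, column)


def layout_by_day(days=9, ne_ids=200, rows_per_day=4):
    """ne_id column in physical order: day by day, ne_id by ne_id within a day."""
    return [
        ne_id
        for _day in range(days)
        for ne_id in range(1, ne_ids + 1)
        for _row in range(rows_per_day)
    ]


# Layout described in the quoted message: ne_id restarts from 1 every day.
by_day = layout_by_day()
# Same rows physically ordered by ne_id, for comparison.
by_ne_id = sorted(by_day)

print("grouped by day:   ", round(physical_correlation(by_day), 3))
print("ordered by ne_id: ", round(physical_correlation(by_ne_id), 3))
```

Under these assumptions the day-grouped layout scores low (on the order of 1/days) even though ne_id is perfectly ordered within each day, because the value drops back to 1 at every day boundary. The ne_id-ordered layout scores close to 1.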

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

In response to

Re: Performance problem with low correlation data at 2009-07-09 07:38:56 from m_lists