Re: Performance problem with low correlation data

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: m_lists(at)yahoo(dot)it
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance problem with low correlation data
Date: 2009-07-09 17:36:45
Message-ID: 20090709173645.GK6414@alvh.no-ip.org
Lists: pgsql-general

m_lists(at)yahoo(dot)it wrote:

> testinsert contains t values between '2009-08-01' and '2009-08-09', and ne_id values from 1 to 20000. But only 800 of the 20000 ne_id values have to be read; there's no need for a table scan!
> I guess this is a reflection of the poor "correlation" on ne_id; but, as I said, I don't really think ne_id is so badly correlated.
> In fact, doing a "select ne_id, t from testinsert limit 100000" I can see that the data is laid out pretty much by "ne_id, t", grouped by day (that is, same ne_id for one day, then next ne_id and so on until next day).
> How is the "correlation" calculated? Can someone explain to me why, after the procedure above, the correlation is so low?

Did you run ANALYZE after the procedure above?
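
On how the number is arrived at: as far as I know, the correlation reported in pg_stats is a statistical correlation between the physical row order and the logical order of the column values, estimated from a sample. Below is a rough Python sketch of that idea, not the actual ANALYZE code. It assumes a plain Pearson correlation over every row rather than a sample, and the table sizes and the names pearson, physical_correlation, grouped_by_day and sorted_by_ne_id are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def physical_correlation(values_in_heap_order):
    """Correlate each row's physical position with its column value."""
    positions = list(range(len(values_in_heap_order)))
    return pearson(positions, values_in_heap_order)


# Hypothetical, scaled-down table: 9 days, 200 ne_ids, 3 rows per ne_id per day.
DAYS, NE_IDS, ROWS_PER_NE_ID = 9, 200, 3

# Layout described in the question: within each day the rows are ordered by
# ne_id, then the whole sequence starts over for the next day.
grouped_by_day = [
    ne_id
    for _day in range(DAYS)
    for ne_id in range(1, NE_IDS + 1)
    for _row in range(ROWS_PER_NE_ID)
]

# Same rows, but physically ordered by ne_id across the whole table.
sorted_by_ne_id = sorted(grouped_by_day)

corr_grouped = physical_correlation(grouped_by_day)
corr_sorted = physical_correlation(sorted_by_ne_id)

print(f"grouped by day:    {corr_grouped:.3f}")
print(f"sorted by ne_id:   {corr_sorted:.3f}")
```

Under these assumptions the day-grouped layout comes out at roughly 1/9 (about 0.11), because ne_id restarts from 1 for every day, while the fully sorted layout comes out close to 1. So a table can look nicely ordered within each day and still have a low table-wide correlation on ne_id.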

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
