Re: Better estimates of index correlation

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Better estimates of index correlation
Date: 2011-03-15 01:36:03
Message-ID: 4D7EC283.9060708@agliodbs.com
Lists: pgsql-hackers


> I don't understand, are they going years between vacuums because their
> data is static? In which case the index correlation won't change. Or
> is it append-only, in which case I suspect the newly appended data is
> likely to have the same correlation as the old data.

Append-only. And yes, one could assume that correlation wouldn't change
frequently. However, it may still change more often than vacuums occur
-- I'm not exaggerating about "years". I have several clients with
large databases whose log tables get vacuumed only for XID wraparound,
roughly once every 2 years.

There's also the question of how we get correlation stats for a new
index/table, or one which has just been upgraded. Requiring a full DB
vacuum isn't practical for those using pg_upgrade.

> But is there
> anything stopping us from doing some sort of ANALYZE-style sample of
> the index pages as well?

This would be ideal. Or even a separate command that scans only the
indexes to collect correlation data. Since the indexes are (usually)
20X to 100X smaller than the tables, it may be practical to full-scan
them even if we can't do the same with the table.
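
To make the index-only scan idea concrete, here is a rough sketch of what such a command might compute. It is not how ANALYZE works today. It assumes a full scan of the index leaf pages has already produced the heap block number of each entry in index key order. The function name order_correlation and the list-of-block-numbers input are both hypothetical, chosen only for illustration. The result is a plain Pearson correlation between index position and heap block number.

```python
from math import sqrt


def order_correlation(heap_blocks):
    """Pearson correlation between index position and heap block number.

    heap_blocks: heap block numbers of the index entries, listed in
    index key order (hypothetical input, as if read off the leaf pages).
    Returns a value in [-1.0, 1.0].
    """
    n = len(heap_blocks)
    if n < 2:
        return 0.0  # too few entries to say anything

    mean_pos = (n - 1) / 2
    mean_blk = sum(heap_blocks) / n

    cov = sum((i - mean_pos) * (b - mean_blk)
              for i, b in enumerate(heap_blocks))
    var_pos = sum((i - mean_pos) ** 2 for i in range(n))
    var_blk = sum((b - mean_blk) ** 2 for b in heap_blocks)

    if var_blk == 0:
        # Every entry points at the same heap block. Pearson correlation
        # is undefined here; treating it as perfectly clustered is an
        # assumption of this sketch.
        return 1.0

    return cov / sqrt(var_pos * var_blk)


# Append-only log table: index order follows heap order closely.
print(order_correlation([0, 0, 1, 1, 2, 2, 3, 3]))
# Index order is the exact reverse of heap order.
print(order_correlation([3, 2, 1, 0]))
```

For an append-only table indexed on a timestamp, the first case is the expected shape, which is why the result should stay near 1.0 as new data arrives.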

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
