Re: [PoC] pgstattuple2: block sampling to reduce physical read

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] pgstattuple2: block sampling to reduce physical read
Date: 2013-07-23 11:02:44
Message-ID: 51EE62D4.7020401@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/23/13 2:16 AM, Satoshi Nagayasu wrote:
> I've been working on new pgstattuple function to allow
> block sampling [1] in order to reduce block reads while
> scanning a table. A PoC patch is attached.

Take a look at all of the messages linked in
https://commitfest.postgresql.org/action/patch_view?id=778

Jaime and I tried to do what you're working on then, including a random
block sampling mechanism modeled on the stats_target mechanism. We
didn't do that as part of pgstattuple though, which was a mistake.

Noah created some test cases as part of his thorough review that were
not computing the correct results. Getting the results correct for all
of the various types of PostgreSQL tables and indexes ended up being
much harder than the sampling part. See
http://www.postgresql.org/message-id/20120222052747.GE8592@tornado.leadboat.com
in particular for that.

> This new function, pgstattuple2(), samples only 3,000 blocks
> (which accounts 24MB) from the table randomly, and estimates
> several parameters of the entire table.

There should be an input parameter to the function for how much sampling
to do, and if it's possible to make the scale for it to look like
ANALYZE that's helpful too.

I have a project for this summer that includes reviving this topic and
making sure it works on some real-world systems. If you want to work on
this too, I can easily combine that project into what you're doing.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Raiskup 2013-07-23 11:53:53 pg_upgrade across more than two neighboring major releases
Previous Message Markus Wanner 2013-07-23 11:00:29 Re: Proposal: template-ify (binary) extensions