Re: Improving N-Distinct estimation by ANALYZE

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Greg Stark <gsstark(at)MIT(dot)EDU>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Improving N-Distinct estimation by ANALYZE
Date:	2006-01-10 22:00:02
Message-ID:	87hd8bzqu5.fsf@stark.xeocode.com
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:

> On Mon, 2006-01-09 at 22:08 -0500, Greg Stark wrote:
>
> > So it's not the 8k block reading that's fooling Linux into reading ahead 32k.
> > It seems 32k readahead is the default for Linux, or perhaps it's the
> > sequential access pattern that's triggering it.
>
> Nah, Linux 2.6 uses flexible readahead logic. It increases slowly when
> you read sequentially, but halves the readahead if you do another access
> type. Can't see that would give an average readahead size of 32k.

I actually read this code at one point in the past. IIRC the readahead is
capped at 32k, which I find interesting given the results. Since this test
uses a sequential access pattern, perhaps what's happening is that the
kernel's test for when to do readahead is too liberal.

All the numbers I'm getting are consistent with a 32k readahead. Even if I run
my program with a 4k block size, I get performance equivalent to a full table
scan very quickly. If I use a 32k block size then the breakeven point is just
over 20%.
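
To illustrate why a 32k readahead would make small-block sampling degrade so
quickly, here is a rough back-of-the-envelope model in Python. It is only a
sketch, not the test program itself: the function name fraction_fetched is
made up for illustration, and it assumes that every read drags in the whole
aligned 32k window containing it, that blocks are sampled independently, and
that seek cost can be ignored.

```python
def fraction_fetched(sample_fraction, block_size, readahead=32 * 1024):
    """Expected fraction of the file fetched from disk under a simplified model.

    Assumptions (not taken from the kernel source): each read pulls in the
    entire aligned readahead-sized window that contains the block, and blocks
    are sampled independently with probability sample_fraction.
    """
    # Number of sampled-size blocks that share one readahead window.
    blocks_per_window = max(1, readahead // block_size)
    # A window is skipped only if none of its blocks is sampled.
    return 1.0 - (1.0 - sample_fraction) ** blocks_per_window


if __name__ == "__main__":
    for block_size in (4096, 8192, 32768):
        for p in (0.05, 0.10, 0.20):
            fetched = fraction_fetched(p, block_size)
            print(f"block={block_size:6d}  sampled={p:4.0%}  fetched={fetched:6.1%}")
```

Under these assumptions, sampling 20% of 4k blocks touches about 83% of the
32k windows, so the disk does nearly the work of a full scan. With 32k blocks
the fetched fraction is just the sampled fraction, so the model by itself does
not explain where the breakeven falls in that case; that depends on seek cost,
which is left out here.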

I suppose what I really ought to do is make some pretty graphs.

--
greg

In response to

Re: Improving N-Distinct estimation by ANALYZE at 2006-01-10 20:34:18 from Simon Riggs
