Quick Links

Re: posix advises ...

From:	Gregory Stark <stark(at)enterprisedb(dot)com>
To:	"Abhijit Menon-Sen" <ams(at)oryx(dot)com>
Cc:	"Zoltan Boszormenyi" <zb(at)cybertec(dot)at>, <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: posix advises ...
Date:	2008-07-11 23:52:42
Message-ID:	87abgo0yid.fsf@oxford.xeocode.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers pgsql-patches

"Abhijit Menon-Sen" <ams(at)oryx(dot)com> writes:

> Hi Zoltán.
>
> I was reading through your posix_fadvise patch,

Actually Zoltan's patch was based on an earlier patch from me. The sections
you're highlighting here are from my original patch.

> and I wanted to ask about this change in particular:
>
>> --- a/src/backend/executor/nodeIndexscan.c
>> +++ b/src/backend/executor/nodeIndexscan.c
>> @@ -290,7 +290,7 @@ ExecIndexEvalArrayKeys(ExprContext *econtext,
>> /* We want to keep the arrays in per-tuple memory */
>> oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
>>
>> - for (j = 0; j < numArrayKeys; j++)
>> + for (j = numArrayKeys-1; j >= 0; j--)

> Why is this loop reversed? (I could have missed some discussion about
> this, I just wanted to make sure it was intentional.)

There was some discussion about this change and in fact if you look at CVS
HEAD you'll find it already applied. It probably doesn't make a big difference
at least in most cases but it seems more sensible to increment the least
significant key elements first since that maximizes the chances of fetching
index keys from the same index pages or nearby pages. Incrementing the most
significant index keys would maximize the distance we're jumpin around in the
index tree.

> But I'm not convinced that this GUC is well-advised; at least, it needs
> some advice about how to determine a sensible size for the parameter
> (and maybe a better name). But is it really necessary at all?

I'm not sure which version of my patch Zoltan's was based on. The later
versions of mine had a GUC named effective_spindle_count which I think is
nicely abstracted away from the implementation details. We do need something
like that because we have no way to determine how wide a raid stripe Postgres
is sitting on and the optimal read-ahead depends on that.

The alternative would be to measure the bandwith with various amounts of
read-ahead. But I fear that would be quite hard on a production system with
the amount of noise from other i/o. It's hard enough on an idle machine with
synthetic benchmarks.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!

In response to

Re: posix advises ... at 2008-07-11 10:02:16 from Abhijit Menon-Sen

Responses

Re: posix advises ... at 2008-07-12 11:34:32 from Abhijit Menon-Sen

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Dave Page	2008-07-11 23:57:24	Re: Vacuuming leaked temp tables (once again)
Previous Message	David E. Wheeler	2008-07-11 23:49:10	Re: PATCH: CITEXT 2.0 v3

Browse pgsql-patches by date

	From	Date	Subject
Next Message	Abhijit Menon-Sen	2008-07-12 11:30:41	Re: Extending grant insert on tables to sequences
Previous Message	Teodor Sigaev	2008-07-11 22:58:54	Re: [PATCHES] GIN improvements