From: | Gregory Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | "Abhijit Menon-Sen" <ams(at)oryx(dot)com> |
Cc: | "Zoltan Boszormenyi" <zb(at)cybertec(dot)at>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: posix advises ... |
Date: | 2008-07-11 23:52:42 |
Message-ID: | 87abgo0yid.fsf@oxford.xeocode.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
"Abhijit Menon-Sen" <ams(at)oryx(dot)com> writes:
> Hi Zoltán.
>
> I was reading through your posix_fadvise patch,
Actually Zoltan's patch was based on an earlier patch from me. The sections
you're highlighting here are from my original patch.
> and I wanted to ask about this change in particular:
>
>> --- a/src/backend/executor/nodeIndexscan.c
>> +++ b/src/backend/executor/nodeIndexscan.c
>> @@ -290,7 +290,7 @@ ExecIndexEvalArrayKeys(ExprContext *econtext,
>> /* We want to keep the arrays in per-tuple memory */
>> oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
>>
>> - for (j = 0; j < numArrayKeys; j++)
>> + for (j = numArrayKeys-1; j >= 0; j--)
> Why is this loop reversed? (I could have missed some discussion about
> this, I just wanted to make sure it was intentional.)
There was some discussion about this change and in fact if you look at CVS
HEAD you'll find it already applied. It probably doesn't make a big difference
at least in most cases but it seems more sensible to increment the least
significant key elements first since that maximizes the chances of fetching
index keys from the same index pages or nearby pages. Incrementing the most
significant index keys would maximize the distance we're jumpin around in the
index tree.
> But I'm not convinced that this GUC is well-advised; at least, it needs
> some advice about how to determine a sensible size for the parameter
> (and maybe a better name). But is it really necessary at all?
I'm not sure which version of my patch Zoltan's was based on. The later
versions of mine had a GUC named effective_spindle_count which I think is
nicely abstracted away from the implementation details. We do need something
like that because we have no way to determine how wide a raid stripe Postgres
is sitting on and the optimal read-ahead depends on that.
The alternative would be to measure the bandwith with various amounts of
read-ahead. But I fear that would be quite hard on a production system with
the amount of noise from other i/o. It's hard enough on an idle machine with
synthetic benchmarks.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
From | Date | Subject | |
---|---|---|---|
Next Message | Dave Page | 2008-07-11 23:57:24 | Re: Vacuuming leaked temp tables (once again) |
Previous Message | David E. Wheeler | 2008-07-11 23:49:10 | Re: PATCH: CITEXT 2.0 v3 |
From | Date | Subject | |
---|---|---|---|
Next Message | Abhijit Menon-Sen | 2008-07-12 11:30:41 | Re: Extending grant insert on tables to sequences |
Previous Message | Teodor Sigaev | 2008-07-11 22:58:54 | Re: [PATCHES] GIN improvements |