From: | Jeremy Harris <jgh(at)wizmail(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Minor performance improvement in transition to external sort |
Date: | 2014-02-25 19:55:08 |
Message-ID: | 530CF51C.1030208@wizmail.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 24/02/14 17:38, Robert Haas wrote:
> On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris <jgh(at)wizmail(dot)org> wrote:
>> Run under cachegrind, it takes about N/10 last-level cache misses,
>> all for the new item being introduced to the heap. The existing
>> code takes none at all.
>
> Can you explain this further? This seems like an important clue that
> could be useful when trying to optimize this code, but I'm a little
> unclear which part of the operation has more cache misses with your
> changes and why.
In the patched version, for the heapify operation the outer loop
starts at the last heap-parent tuple and works backward to the
start of the tuples array. A copy is taken of the SortTuple being
operated on for the inner loop to use. This copy suffers cache misses.
(The inner loop operates on elements between the copy source and the
end of the array).
In the original, the outer loop runs the array in increasing index
order. Again a copy is taken of the SortTuple for the inner loop
to use. This copy does not appear to take significant cache misses.
(The inner loop operates on elements between the copy source and
the start of the array).
--
Cheers,
Jeremy
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2014-02-25 20:03:36 | Re: Unfortunate choice of short switch name in pgbench |
Previous Message | Tom Lane | 2014-02-25 19:49:08 | Unfortunate choice of short switch name in pgbench |