From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore |
Date: | 2015-09-02 23:12:29 |
Message-ID: | 20150902231229.GF8555@awork2.anarazel.de |
Lists: | pgsql-hackers |
On 2015-09-02 16:02:00 -0700, Peter Geoghegan wrote:
> On Wed, Sep 2, 2015 at 3:13 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > That's just a question of how to formulate this though?
> >
> > pg_rfetch(((char *) state->memtuples ) + 3 * sizeof(SortTuple) + offsetof(SortTuple, tuple))?
> >
> > For something heavily platform dependent like this that seems ok.
>
> Well, still needs to work for tuplestore, which does not have a SortTuple.
Isn't it even more trivial there? It's just an array of void*'s? So
prefetch(state->memtuples + 3 + readptr->current)?
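
To make the suggestion above concrete, here is a minimal sketch of that kind of prefetch over an array of void pointers. It is not the patch's code. The macro name `my_read_prefetch`, the `PREFETCH_DISTANCE` constant, and the `sum_with_prefetch` function are hypothetical names chosen for illustration, and the sketch assumes a GCC- or Clang-compatible compiler that provides `__builtin_prefetch` (elsewhere the macro expands to a no-op).

```c
#include <stddef.h>

/*
 * Hypothetical read-prefetch wrapper (an assumption, not the patch's
 * definition).  On GCC/Clang it issues a prefetch hint for reading with
 * high temporal locality; on other compilers it does nothing.
 */
#if defined(__GNUC__) || defined(__clang__)
#define my_read_prefetch(addr) __builtin_prefetch((addr), 0, 3)
#else
#define my_read_prefetch(addr) ((void) 0)
#endif

/* How many array slots ahead to prefetch; 3 mirrors the "+ 3" above. */
#define PREFETCH_DISTANCE 3

/*
 * Walk an array of pointers sequentially, hinting the slot that will be
 * read PREFETCH_DISTANCE iterations from now.  Each element is assumed
 * (for this sketch only) to point at a long.
 */
static long
sum_with_prefetch(void **tuples, size_t ntuples)
{
    long        total = 0;

    for (size_t i = 0; i < ntuples; i++)
    {
        /* Stay inside the array when forming the prefetch address. */
        if (i + PREFETCH_DISTANCE < ntuples)
            my_read_prefetch(tuples + i + PREFETCH_DISTANCE);

        total += *(long *) tuples[i];
    }
    return total;
}
```

Note that, exactly like the expression written above, this hints the array slot itself rather than the memory the slot points to. Prefetching the pointee would instead pass `tuples[i + PREFETCH_DISTANCE]`. A prefetch is only a hint, so the function returns the same result with or without it.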
> Because of the way tuples are fetched across translation unit
> boundaries in the cases addressed by the patch, it isn't hard to see
> why the compiler does not do this automatically (prefetch instructions
> added by the compiler are not common anyway, IIRC).
Hardware prefetchers have simply gotten rather good, and that has
obliterated most of the cases where explicit prefetching is beneficial.
I'd be interested to see a perf stat -ddd comparison of the patch with
and without the prefetches. It'll be interesting to see how the number
of cache hits/misses and prefetches changes.
Which microarchitecture did you test this on?