From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore |
Date: | 2015-09-02 23:02:00 |
Message-ID: | CAM3SWZQh1-jGa+OpM5WgZxFO_D4KNngMtpTyKv-HWuNGwCG8UQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Sep 2, 2015 at 3:13 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> I'd be less brief in this case then, no need to be super short here.
Okay.
>> It started out that way, but Tom felt that it was better to have a
>> USE_MEM_PREFETCH because of the branch below...
>
> That doesn't mean we shouldn't still provide an empty definition.
Okay.
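For illustration, a minimal sketch of what "an empty definition" could look like. The names pg_rfetch and USE_MEM_PREFETCH come from the thread; the macro bodies, the compiler check, and the sum_with_hint() caller are assumptions made for this example, not the patch's actual code.

```c
#include <stddef.h>

/*
 * Assumption: USE_MEM_PREFETCH would normally be set by configure when the
 * compiler offers __builtin_prefetch (GCC and Clang do). Here it is derived
 * from the compiler macros directly to keep the sketch self-contained.
 */
#if defined(__GNUC__) || defined(__clang__)
#define USE_MEM_PREFETCH
#endif

#ifdef USE_MEM_PREFETCH
/* Read hint: second argument 0 = read, third argument 0 = low temporal locality */
#define pg_rfetch(addr) __builtin_prefetch((addr), 0, 0)
#else
/* Empty definition: call sites compile unchanged where no hint is available */
#define pg_rfetch(addr) ((void) 0)
#endif

/* Hypothetical caller: sums an array while hinting three elements ahead */
long
sum_with_hint(const long *vals, size_t n)
{
    long        total = 0;
    size_t      i;

    for (i = 0; i < n; i++)
    {
        if (i + 3 < n)
            pg_rfetch(&vals[i + 3]);
        total += vals[i];
    }
    return total;
}
```

With the fallback in place, callers need no #ifdef of their own unless a whole branch should be compiled out.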
> That's just a question of how to formulate this though?
>
> pg_rfetch(((char *) state->memtuples ) + 3 * sizeof(SortTuple) + offsetof(SortTuple, tuple))?
>
> For something heavily platform dependent like this that seems ok.
Well, it still needs to work for tuplestore, which does not have a SortTuple.
>> Because that was the fastest value following testing on my laptop. You
>> are absolutely right to point out that this isn't a good reason to go
>> with the patch -- I share your concern. All I can say in defense of
>> that is that other major system software does the same, without any
>> regard for the underlying microarchitecture AFAICT.
>
> I know linux stripped out most prefetches at some point, even from x86
> specific code, because it showed that they aged very badly. I.e. they
> removed a bunch of them and stuff got faster, whereas they were
> beneficial on earlier architectures.
That is true, but IIRC that was specifically in relation to a commonly
used list data structure that had prefetches all over the place. That
was a pretty bad idea.
I think that explicit prefetching has extremely limited uses, too. The
only cases that I can imagine being helped are cases where there is
extremely predictable sequential access, but some pointer indirection.
Because of the way tuples are fetched across translation unit
boundaries in the cases addressed by the patch, it isn't hard to see
why the compiler does not do this automatically (prefetch instructions
added by the compiler are not common anyway, IIRC). The compiler has
no way of knowing that gettuple_common() is ultimately called from an
important inner loop, which could make all the difference, I suppose.
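A rough sketch of the access pattern described above: a sequential walk over an array whose entries each point at a separately allocated tuple, with a read hint issued a few entries ahead. All Demo* names, prefetch_read, and PREFETCH_DISTANCE are hypothetical and exist only for this example; the distance of 3 mirrors the figure quoted earlier in the thread and is not a tuned value.

```c
#include <stddef.h>

/* Assumption: GCC/Clang builtin where available, otherwise a no-op */
#if defined(__GNUC__) || defined(__clang__)
#define prefetch_read(addr) __builtin_prefetch((addr), 0, 0)
#else
#define prefetch_read(addr) ((void) 0)
#endif

#define PREFETCH_DISTANCE 3     /* hypothetical; would need per-platform testing */

typedef struct DemoTuple
{
    int         key;
    char        payload[60];
} DemoTuple;

/* Array element: the tuple itself lives behind a pointer */
typedef struct DemoSortTuple
{
    DemoTuple  *tuple;
    int         datum1;
} DemoSortTuple;

typedef struct DemoState
{
    DemoSortTuple *memtuples;
    size_t      memtupcount;
    size_t      current;
} DemoState;

/*
 * Return the next tuple, or NULL when exhausted. The array slots are read
 * sequentially, but each tuple pointer leads somewhere else in memory, so
 * the hint targets the tuple a few slots ahead rather than the slot itself.
 */
DemoTuple *
demo_gettuple(DemoState *state)
{
    size_t      ahead = state->current + PREFETCH_DISTANCE;

    if (state->current >= state->memtupcount)
        return NULL;
    if (ahead < state->memtupcount)
        prefetch_read(state->memtuples[ahead].tuple);
    return state->memtuples[state->current++].tuple;
}

/* The "important inner loop", which in practice sits in another translation unit */
long
demo_sum_keys(DemoState *state)
{
    long        total = 0;
    DemoTuple  *t;

    while ((t = demo_gettuple(state)) != NULL)
        total += t->key;
    return total;
}
```

The hint only changes timing, never results, so whether it helps at all has to be measured on the hardware in question.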
--
Peter Geoghegan