From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: cost_sort() may need to be updated |
Date: | 2016-09-11 16:13:08 |
Message-ID: | CAM3SWZTD6mXXf+zcLA_CsLnV2yRCVrntE0p0mTMeNVgkGvURNw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Sep 11, 2016 at 9:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> writes:
>> I think that we *can* refine this guess, and should, because random
>> I/O is really quite unlikely to be a large cost these days (I/O in
>> general often isn't a large cost, actually). More fundamentally, I
>> think it's a problem that cost_sort() thinks that external sorts are
>> far more expensive than internal sorts in general. There is good
>> reason to think that that does not reflect the reality. I think we can
>> expect external sorts to be *faster* than internal sorts with
>> increasing regularity in Postgres 10.
>
> TBH, if that's true, haven't you broken something?
It's possible for external sorts to be faster some of the time because
the memory access patterns can be more cache efficient: smaller runs
are better when accessing tuples in sorted order, scattered across
memory. More importantly, the sort can start returning tuples earlier
in the common case where a final on-the-fly merge can be performed. In
principle, you could adopt internal sorts to have the same advantages,
but that hasn't and probably won't happen. Finally, the external sort
I/O costs grow linearly, whereas the CPU costs grow in a linearithmic
fashion, which will eventually come to dominate. We can hide the
latency of those costs pretty well, too, with asynchronous I/O.
I'm not arguing that cost_sort() should think that external sorts are
cheaper under any circumstances, since all of this is very hard to
model. I only mention this because it illustrates nicely that
cost_sort() has the wrong idea.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2016-09-11 16:20:32 | Re: Merge Join with an Index Optimization |
Previous Message | Tom Lane | 2016-09-11 16:01:22 | Re: cost_sort() may need to be updated |