Re: Using quicksort for every external sort run

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using quicksort for every external sort run
Date: 2016-03-24 03:28:31
Message-ID: CAM3SWZTHhoTS5NjfH1bvigaU1WxhN3kT5BUB=XUXeASXc0+bUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 23, 2016 at 8:05 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> FWIW, maintenance_work_mem was set to 1GB on the i5 machine and 256MB on the
> Xeon. Hmm, maybe that's why we see no difference for CREATE INDEX on the i5,
> and an improvement on the Xeon.

That would explain it.

> Not a big deal - it's easy enough to change the config and repeat the
> benchmark. Are there any particular replacement_sort_mem values that you
> think would be interesting to configure?

I would start with replacement_sort_mem=64. i.e., 64KB, effectively disabled

> I have to admit I'm a bit afraid we'll introduce a new GUC that only very
> few users will know how to set properly, and so most people will run with
> the default value or set it to something stupid.

I agree.

> Are you saying none of the queries triggers the memory context resets? What
> queries would trigger that (to test the theory)?

They will still do the context resetting and so on just the same, but
would use a heap for the first attempt. But replacement_sort_mem=64
would let us know that

> But if we care about 9.5 -> 9.6 regressions, then perhaps we should include
> that commit into the benchmark, because that's what the users will see? Or
> have I misunderstood the second part?

I think it's good that you didn't test the March 17 commit of the
memory batching to the master branch when testing the master branch.
You should continue to do that, because we care about regressions
against 9.5 only. The only issue insofar as what code was tested is
that replacement_sort_mem was not set to 64 (to effectively disable
any use of the heap by the patch). I would like to see if we can get
rid of replacement_sort_mem without causing any real regressions,
which I think the memory context reset stuff makes possible.

There was a new version of my quicksort patch posted after March 17,
but don't worry about it -- that's totally cosmetic. Some minor
tweaks.

> BTW which patch does the memory batching? A quick search through git log did
> not return any recent patches mentioning these terms.

Commit 0011c0091e886b874e485a46ff2c94222ffbf550. But, like I said,
avoid changing what you're testing as master; do not include that. The
patch set you were testing is fine. Nothing is missing.

> As I mentioned above, I haven't realized work_mem does not matter for CREATE
> INDEX, and maintenance_work_mem was set to a fixed value for the whole test.
> And the two machines used different values for this particular configuration
> value - Xeon used just 256MB, while i5 used 1GB. So while on i5 it was just
> a single chunk, on Xeon there were multiple batches. Hence the different
> behavior.

Makes sense. Obviously this should be avoided, though.

> No it doesn't add overhead. The script actually does
>
> COPY (query) TO '/dev/null'
>
> on the server for all queries (except for the CREATE INDEX, obviously), so
> there should be pretty much no overhead due to transferring rows to the
> client and so on.

That still adds overhead, because the output functions are still used
to create a textual representation of data. This was how Andres tested
the improvement to the timestamptz output function committed to 9.6,
for example.

>> If you really wanted to make the patch look good, a sort with 5GB of
>> work_mem is the best way, FWIW. The heap data structure used by the
>> master branch tuplesort.c will handle that very badly. You use no
>> temp_tablespaces here. I wonder if the patch would do better with
>> that. Sorting can actually be quite I/O bound with the patch
>> sometimes, where it's usually CPU/memory bound with the heap,
>> especially with lots of work_mem. More importantly, it would be more
>> informative if the temp_tablespace was not affected by I/O from
>> Postgres' heap.
>
>
> I'll consider testing that. However, I don't think there was any significant
> I/O on the machines - particularly not on the Xeon, which has 16GB of RAM.
> So the temp files should fit into that quite easily.

Right, but with a bigger sort, there might well be more I/O.
Especially for the merge. It might be that that holds back the patch
from doing even better than the master branch does.

> The i5 machine has only 8GB of RAM, but it has 6 SSD drives in raid0. So I
> doubt it was I/O bound.

These patches can sometimes be significantly I/O bound on my laptop,
where that didn't happen before. Sounds unlikely here, though.

>> I also like seeing a sample of "trace_sort = on" output. I don't
>> expect you to carefully collect that in every case, but it can tell
>> us a lot about what's really going on when benchmarking.
>
>
> Sure, I can collect that.

Just for the interesting cases. Or maybe just dump it all and let me
figure it out for myself. trace_sort output shows me how many runs
they are, how abbreviation did, how memory was used, and even if the
sort was I/O bound at various stages (it dumps some getrusage stats to
the log, too). You can usually tell exactly what happened for external
sorts, which is very interesting for those one or two cases that you
found to be noticeably worse off with the patch.

Thanks for testing!
--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-03-24 03:54:02 Re: Rationalizing code-sharing among src/bin/ directories
Previous Message Tomas Vondra 2016-03-24 03:05:54 Re: Using quicksort for every external sort run