From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Oleg Golovanov <rentech(at)mail(dot)ru>
Subject: Re: [HACKERS] Parallel Hash take II
Date: 2017-11-15 18:35:49
Message-ID: 20171115183549.wy6k76n2wh7pntob@alap3.anarazel.de
Lists: pgsql-hackers
On 2017-11-15 08:37:11 -0500, Robert Haas wrote:
> On Tue, Nov 14, 2017 at 4:24 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> I agree, and I am interested in that subject. In the meantime, I
> >> think it'd be pretty unfair if parallel-oblivious hash join and
> >> sort-merge join and every other parallel plan get to use work_mem * p
> >> (and in some cases waste it with duplicate data), but Parallel Hash
> >> isn't allowed to do the same (and put it to good use).
> >
> > I'm not sure I care about fairness between pieces of code ;)
>
> I realize you're sort of joking here, but I think it's necessary to
> care about fairness between pieces of code.
Indeed I kinda was.
> I mean, the very first version of this patch that Thomas submitted was
> benchmarked by Rafia and had phenomenally good performance
> characteristics. That turned out to be because it wasn't respecting
> work_mem; you can often do a lot better with more memory, and
> generally you can't do nearly as well with less. To make comparisons
> meaningful, they have to be comparisons between algorithms that use
> the same amount of memory. And it's not just about testing. If we
> add an algorithm that will run twice as fast with equal memory but
> only allow it half as much, it will probably never get picked and the
> whole patch is a waste of time.
But this does bug me, and I think it's what made me pause here and make a
bad joke. The way parallelism treats work_mem makes it an even less
useful config knob than it was before. Parallelism, especially after
this patch, shouldn't compete with / be benchmarked against a
single-process run with the same work_mem. To make it "fair" you have to
compare parallelism against a single-threaded run with work_mem *
max_parallelism.
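To make that concrete, something like the following is the kind of
comparison I have in mind (the settings and the query are just made-up
placeholders, and I'm ignoring the leader's participation for
simplicity):

    -- serial baseline, given the aggregate memory budget
    SET max_parallel_workers_per_gather = 0;
    SET work_mem = '256MB';
    EXPLAIN ANALYZE SELECT count(*) FROM fact f JOIN dim d ON d.id = f.dim_id;

    -- parallel run, four workers each limited to a quarter of that
    SET max_parallel_workers_per_gather = 4;
    SET work_mem = '64MB';
    EXPLAIN ANALYZE SELECT count(*) FROM fact f JOIN dim d ON d.id = f.dim_id;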
Thomas argues that this means hash joins are treated fairly vis-à-vis
parallel-oblivious hash joins etc. And I think he has somewhat of a
point. But I don't think it's quite right either: in several of these
cases the planner will not prefer the multi-process plan *because* it
uses more work_mem; that extra memory is a cost to be paid, not a
benefit. Whereas this'll optimize towards using
work_mem * max_parallel_workers_per_gather worth of memory.
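Spelled out with made-up numbers (work_mem = 64MB, 4 workers plus the
leader):

    serial hash join:              1 private hash table * 64MB      =  64MB
    parallel-oblivious hash join:  5 private copies * 64MB          = 320MB (duplicated data)
    Parallel Hash:                 1 shared table, allowed 5 * 64MB = 320MB

The totals end up in the same ballpark, but only in the last case does
the aggregate become a budget the planner actively plans towards.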
This makes it pretty much impossible to tune work_mem on a server in a
reasonable manner afterwards. Previously you'd tune it to something
like free_server_memory - (max_connections * work_mem *
80%_most_complex_query). You can't really do that anymore; now you'd
also need to multiply by max_parallel_workers_per_gather. Which means
that you might end up "forcing" parallelism on a bunch of plans that'd
normally execute in too short a time to make parallelism worth it.
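Back-of-the-envelope, with made-up numbers: 100 connections * 64MB
work_mem * ~4 work_mem consumers per complex query is roughly 25GB of
worst-case query memory. If each of those consumers can now run in a
leader plus 4 workers, the worst case becomes roughly 5 * 25GB = 125GB,
so to keep the old headroom you'd have to cut work_mem to something
like 13MB.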
I don't really have a good answer to "but what should we otherwise do",
but I'm doubtful this is quite the right answer.
Greetings,
Andres Freund