From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2017-12-30 16:16:34
Message-ID: 20171230161634.3lpvgerqsysricaa@alap3.anarazel.de
Lists: pgsql-committers pgsql-hackers
Hi,
On 2017-12-31 02:51:26 +1300, Thomas Munro wrote:
> You mentioned that prairiedog sees the problem about one time in
> thirty. Would you mind checking if it goes away with this patch
> applied?
>
> --
> Thomas Munro
> http://www.enterprisedb.com
> From cbed027275039cc5debf8db89342a133a831c9ca Mon Sep 17 00:00:00 2001
> From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
> Date: Sun, 31 Dec 2017 02:03:07 +1300
> Subject: [PATCH] Fix EXPLAIN ANALYZE output for Parallel Hash.
>
> In a race case, EXPLAIN ANALYZE could fail to display correct nbatch and size
> information. Refactor so that participants report only on batches they worked
> on rather than trying to report on all of them, and teach explain.c to
> consider the HashInstrumentation object from all participants instead of
> picking the first one it can find. This should fix an occasional build farm
> failure in the "join" regression test.
This seems buggy independent of whether it fixes the issue on prairiedog,
right? So I'm inclined to go ahead and just fix it...
> + /*
> + * Merge results from workers. In the parallel-oblivious case, the
> + * results from all participants should be identical, except where
> + * participants didn't run the join at all so have no data. In the
> + * parallel-aware case, we need to aggregate the results. Each worker may
> + * have seen a different subset of batches and we want to report the peak
> + * memory usage across all batches.
> + */
It's not necessarily the peak though, right? The largest batches might
not be read in at the same time. I'm fine with approximating it as such,
just want to make sure I understand.
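To make the concern concrete: taking the maximum of per-participant peaks only approximates the true instant-in-time peak, because participants can have batches loaded concurrently and their largest observations need not coincide. A toy sketch (all numbers made up purely for illustration, not taken from the patch):

```python
# Hypothetical memory usage (kB) sampled at three points in time for
# two workers cooperating on a Parallel Hash join. Purely illustrative.
worker_usage = [
    [100, 400, 50],   # worker 0: its own peak is 400, at tick 1
    [300, 100, 350],  # worker 1: its own peak is 350, at tick 2
]

# What Max(space_peak, worker_hi->space_peak) style merging yields:
# the largest value any single participant ever observed.
reported = max(max(usage) for usage in worker_usage)

# The true instantaneous peak: at each tick, sum usage across workers,
# then take the maximum over time.
true_peak = max(sum(tick) for tick in zip(*worker_usage))

print(reported, true_peak)  # the two can differ
```

Here `reported` is 400 while `true_peak` is 500, because both workers happened to hold memory at tick 1. Whether this matters in practice depends on how the executor actually accounts shared batch memory; the point is only that max-of-peaks is an approximation, as acknowledged above.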
> + if (hashstate->shared_info)
> {
> SharedHashInfo *shared_info = hashstate->shared_info;
> int i;
>
> - /* Find the first worker that built a hash table. */
> for (i = 0; i < shared_info->num_workers; ++i)
> {
> - if (shared_info->hinstrument[i].nbatch > 0)
> + HashInstrumentation *worker_hi = &shared_info->hinstrument[i];
> +
> + if (worker_hi->nbatch > 0)
> {
> - hinstrument = &shared_info->hinstrument[i];
> - break;
> + /*
> + * Every participant should agree on the buckets, so to be
> + * sure we have a value we'll just overwrite each time.
> + */
> + hinstrument.nbuckets = worker_hi->nbuckets;
> + hinstrument.nbuckets_original = worker_hi->nbuckets_original;
> + /*
> + * Normally every participant should agree on the number of
> + * batches too, but it's possible for a backend that started
> + * late and missed the whole join not to have the final nbatch
> + * number. So we'll take the largest number.
> + */
> + hinstrument.nbatch = Max(hinstrument.nbatch, worker_hi->nbatch);
> + hinstrument.nbatch_original = worker_hi->nbatch_original;
> + /*
> + * In a parallel-aware hash join, for now we report the
> + * maximum peak memory reported by any worker.
> + */
> + hinstrument.space_peak =
> + Max(hinstrument.space_peak, worker_hi->space_peak);
I bet pgindent will not like this layout.
Ho hum. Is this really, as the comment above says, a case of "aggregate the results"?
Greetings,
Andres Freund