Re: pgsql: Add some regression tests that exercise hash join code.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject: Re: pgsql: Add some regression tests that exercise hash join code.
Date: 2017-12-04 20:21:42
Message-ID: 20171204202142.spwkwc545vvxmed7@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On 2017-12-04 12:28:56 +1300, Thomas Munro wrote:
> From 43eeea0bc35204440d262224b56efca958b33428 Mon Sep 17 00:00:00 2001
> From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
> Date: Mon, 4 Dec 2017 11:52:11 +1300
> Subject: [PATCH] Fix EXPLAIN ANALYZE of hash join when the leader doesn't
> execute.
>
> If a hash join appears in a parallel query, there may be no hash table
> available for explain.c to inspect even though a hash table may have been
> built in other processes. This could happen either because
> parallel_leader_participation was set to off or because the leader happened to
> hit the end of the outer relation immediately (even though the complete
> relation is not empty) and decided not to build the hash table.
>
> Commit bf11e7ee introduced a way for workers to exchange instrumentation via
> the DSM segment for Sort nodes even though they are not parallel-aware. This
> commit does the same for Hash nodes, so that explain.c has a way to find
> instrumentation data from an arbitrary participant that actually built the
> hash table.

Makes sense. Seems ok to only fix this inv11, even if present earlier.

> diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
> index 447f69d044e..1ffe635d66c 100644
> --- a/src/backend/commands/explain.c
> +++ b/src/backend/commands/explain.c
> @@ -2379,34 +2379,62 @@ show_sort_info(SortState *sortstate, ExplainState *es)
> static void
> show_hash_info(HashState *hashstate, ExplainState *es)
> {
> - HashJoinTable hashtable;
> + HashInstrumentation *hinstrument = NULL;
>
> - hashtable = hashstate->hashtable;
> + /*
> + * In a parallel query, the leader process may or may not have run the
> + * hash join, and even if it did it may not have built a hash table due to
> + * timing (if it started late it might have seen no tuples in the outer
> + * relation and skipped building the hash table). Therefore we have to be
> + * prepared to get instrumentation data from a worker if there is no hash
> + * table.
> + */
> + if (hashstate->hashtable)
> + {
> + hinstrument = (HashInstrumentation *)
> + palloc(sizeof(HashInstrumentation));
> + ExecHashGetInstrumentation(hinstrument, hashstate->hashtable);
> + }
> + else if (hashstate->shared_info)
> + {
> + SharedHashInfo *shared_info = hashstate->shared_info;
> + int i;
> +
> + /* Find the first worker that built a hash table. */
> + for (i = 0; i < shared_info->num_workers; ++i)
> + {
> + if (shared_info->hinstrument[i].nbatch > 0)
> + {
> + hinstrument = &shared_info->hinstrument[i];
> + break;
> + }
> + }
> + }

I can't think of a way that, before the parallel HJ patch, results in a
difference between the size of the hashtable between workers. Short of
some participant not having a hashtable at all, that is. Wwe also don't
currently display lookup information etc. So just more or less randomly
choosing one worker's seems ok.

> + case T_HashState:
> + /* even when not parallel-aware */

"..., for stats" or such maybe?

> + ExecHashEstimate((HashState *) planstate, e->pcxt);
> + break;
> case T_SortState:
> /* even when not parallel-aware */

I'd just update that comment as well.

> #endif /* NODEHASH_H */
> diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
> index e05bc04f525..6c322e57c00 100644
> --- a/src/include/nodes/execnodes.h
> +++ b/src/include/nodes/execnodes.h
> @@ -1980,6 +1980,29 @@ typedef struct GatherMergeState
> struct binaryheap *gm_heap; /* binary heap of slot indices */
> } GatherMergeState;
>
> +/* ----------------
> + * Values displayed by EXPLAIN ANALYZE
> + * ----------------
> + */
> +typedef struct HashInstrumentation
> +{
> + int nbuckets;
> + int nbuckets_original;
> + int nbatch;
> + int nbatch_original;
> + size_t space_peak;
> +} HashInstrumentation;

Maybe I'm being pedantic here, but I think it'd not hurt to explain what
these mean. It's obviously pretty clear what nbuckets/batch mean, but
the difference between original and not is far less clea.r

Looks reasonable these nitpicks aside.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Robert Haas 2017-12-04 20:26:08 pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i
Previous Message Tom Lane 2017-12-04 16:51:50 pgsql: Support boolean columns in functional-dependency statistics.