From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Richard Guo <guofenglinux(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Wrong results from Parallel Hash Full Join |
Date: | 2023-04-12 18:59:11 |
Message-ID: | CAAKRu_Z=rzXjxVtA7gPe2bmBk7F6e4_pe8iT5DmnYFOgWPxMGQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Apr 12, 2023 at 2:14 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2023-04-12 10:57:17 -0400, Melanie Plageman wrote:
> > HeapTupleHeaderHasMatch() checks if HEAP_TUPLE_HAS_MATCH is set.
> >
> > In htup_details.h, you will see that HEAP_TUPLE_HAS_MATCH is defined as
> > HEAP_ONLY_TUPLE
> > /*
> > * HEAP_TUPLE_HAS_MATCH is a temporary flag used during hash joins. It is
> > * only used in tuples that are in the hash table, and those don't need
> > * any visibility information, so we can overlay it on a visibility flag
> > * instead of using up a dedicated bit.
> > */
> > #define HEAP_TUPLE_HAS_MATCH HEAP_ONLY_TUPLE /* tuple has a join match */
> >
> > If you redefine HEAP_TUPLE_HAS_MATCH as something that isn't already
> > used, say 0x1800, the query returns correct results.
> > [...]
> > The question is, why does this only happen for a parallel full hash join?
>
> I'd guess that PHJ code is missing a HeapTupleHeaderClearMatch() somewhere,
> but the non-parallel case isn't.
Indeed. Thanks! This diff fixes the case Richard provided.
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index a45bd3a315..54c06c5eb3 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -1724,6 +1724,7 @@ retry:
         /* Store the hash value in the HashJoinTuple header. */
         hashTuple->hashvalue = hashvalue;
         memcpy(HJTUPLE_MINTUPLE(hashTuple), tuple, tuple->t_len);
+        HeapTupleHeaderClearMatch(HJTUPLE_MINTUPLE(hashTuple));
 
         /* Push it onto the front of the bucket's list */
         ExecParallelHashPushTuple(&hashtable->buckets.shared[bucketno],
I will propose a patch that includes this change and a test.
I just want to convince myself that ExecParallelHashTableInsertCurrentBatch()
covers the non-batch 0 cases and we don't need to add something to
sts_puttuple().
- Melanie
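
For readers following along, here is a minimal, self-contained C sketch of why
the one-line fix matters. It is a toy model, not PostgreSQL code:
ToyTupleHeader and the toy_* functions are hypothetical names invented for
illustration, and only the flag value and the overlay of HEAP_TUPLE_HAS_MATCH
on HEAP_ONLY_TUPLE mirror htup_details.h. The assumption is that a source tuple
can arrive with HEAP_ONLY_TUPLE already set, so a plain memcpy() into the hash
table carries that bit along and the tuple later looks as if it had already
found a join match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Same overlay as in htup_details.h: the match flag reuses a visibility bit. */
#define HEAP_ONLY_TUPLE      0x8000
#define HEAP_TUPLE_HAS_MATCH HEAP_ONLY_TUPLE

/* Hypothetical, heavily simplified stand-in for a tuple header. */
typedef struct ToyTupleHeader
{
    uint16_t t_infomask2;   /* flag bits, including the overlaid match bit */
    int      payload;       /* stands in for the tuple's data */
} ToyTupleHeader;

/* Toy analogue of HeapTupleHeaderHasMatch(). */
static bool
toy_has_match(const ToyTupleHeader *tup)
{
    return (tup->t_infomask2 & HEAP_TUPLE_HAS_MATCH) != 0;
}

/* Toy analogue of HeapTupleHeaderClearMatch(). */
static void
toy_clear_match(ToyTupleHeader *tup)
{
    tup->t_infomask2 &= (uint16_t) ~HEAP_TUPLE_HAS_MATCH;
}

/*
 * Copy a tuple into the "hash table" slot. With clear_match == false this
 * models the code before the fix: any stale HEAP_ONLY_TUPLE bit is copied
 * verbatim. With clear_match == true it models the fixed code path.
 */
static void
toy_insert(ToyTupleHeader *dst, const ToyTupleHeader *src, bool clear_match)
{
    memcpy(dst, src, sizeof(*dst));
    if (clear_match)
        toy_clear_match(dst);
}

/*
 * A full join's unmatched-inner scan emits only tuples whose match flag is
 * unset, so a stale flag silently drops a row from the result.
 */
static bool
toy_emitted_by_unmatched_scan(const ToyTupleHeader *tup)
{
    return !toy_has_match(tup);
}
```

Inserting a source tuple whose t_infomask2 has HEAP_ONLY_TUPLE set with
clear_match = false leaves toy_has_match() true, so the unmatched scan skips a
tuple that never matched anything. With clear_match = true the flag starts out
clear and the tuple is emitted as expected.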