Fixing Hash Join bug I caused with adf97c156

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>
Subject: Fixing Hash Join bug I caused with adf97c156
Date: 2024-10-16 01:10:06
Message-ID: CAApHDvqo9eenEFXND5zZ9JxO_k4eTA4jKMGxSyjdTrsmYvnmZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Yesterday Andres mentioned to me that he's getting wrong results with
TPCH-Q02. Andres did the analysis to figure out that this was caused
by adf97c156 due to how I chose to store intermediate hash values when
the Hash Join has multiple join keys.

Per Andres' investigation, it seems what's going on is that
EEOP_PARAM_SET calls ExecEvalParamSet() and that sets the param value
from the ExprState->resvalue, the same location as the previous
hashing step stored its intermediate result.

It's probably not very good practice to store intermediate things in
ExprState->resvalue and expect them to still be there after evaluation
of steps that you might not have any control over, so the attached
adjust things to add a dedicated location for the intermediate hash
value and adjusts the step generation code to store all apart from the
final hashing step into that location.

I tested this with TPCH-Q02 and the results are correct again (when
compared with SET enable_hashjoin=0). I tried both with JIT on and
off.

The Hash Join in question looks like:

-> Hash Join (cost=137144.04..630694.61 rows=1 width=197)
Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND
((SubPlan 1) = partsupp.ps_supplycost))

I plan to push this soon, but if anyone wants to look over it, I'll
leave it here for a while before doing so.

David

Attachment Content-Type Size
0001-Don-t-store-intermediate-hash-values-in-ExprState-re.patch application/octet-stream 7.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-10-16 01:20:17 Re: Large expressions in indexes can't be stored (non-TOASTable)
Previous Message Michael Paquier 2024-10-16 00:43:48 Re: BF mamba failure