Re: Fixing Hash Join bug I caused with adf97c156

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Fixing Hash Join bug I caused with adf97c156
Date: 2024-10-16 01:22:29
Message-ID: f87a0f45-10b5-4da1-93aa-e2bd3f6fcdac@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On 16.10.2024 04:10, David Rowley wrote:
> Yesterday Andres mentioned to me that he's getting wrong results with
> TPCH-Q02. Andres did the analysis to figure out that this was caused
> by adf97c156 due to how I chose to store intermediate hash values when
> the Hash Join has multiple join keys.
>
> Per Andres' investigation, it seems what's going on is that
> EEOP_PARAM_SET calls ExecEvalParamSet() and that sets the param value
> from the ExprState->resvalue, the same location as the previous
> hashing step stored its intermediate result.
>
> It's probably not very good practice to store intermediate things in
> ExprState->resvalue and expect them to still be there after evaluation
> of steps that you might not have any control over, so the attached
> adjust things to add a dedicated location for the intermediate hash
> value and adjusts the step generation code to store all apart from the
> final hashing step into that location.
>
> I tested this with TPCH-Q02 and the results are correct again (when
> compared with SET enable_hashjoin=0). I tried both with JIT on and
> off.
>
> The Hash Join in question looks like:
>
> -> Hash Join (cost=137144.04..630694.61 rows=1 width=197)
> Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND
> ((SubPlan 1) = partsupp.ps_supplycost))
>
> I plan to push this soon, but if anyone wants to look over it, I'll
> leave it here for a while before doing so.
>

Thank you for the work on this issue!

I haven't noticed anything wrong yet. I think it's worth adding a test
to regression tests, isn't it?

--
Regards,
Alena Rybakina
Postgres Professional

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-10-16 01:24:32 Re: Large expressions in indexes can't be stored (non-TOASTable)
Previous Message Nathan Bossart 2024-10-16 01:20:17 Re: Large expressions in indexes can't be stored (non-TOASTable)