Re: incorrect results and different plan with 2 very similar queries

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Dave Cramer <davecramer(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: incorrect results and different plan with 2 very similar queries
Date: 2024-03-28 22:04:33
Message-ID: 750b4884-7276-49d5-b577-364f1f3061a2@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/27/24 23:10, Dave Cramer wrote:
> Dave Cramer
>
>
> On Wed, 27 Mar 2024 at 17:57, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
>> On Thu, 28 Mar 2024 at 10:33, Dave Cramer <davecramer(at)gmail(dot)com> wrote:
>>> There is a report on the pgjdbc github JDBC Driver shows erratic
>> behavior when filtering on CURRENT_DATE · pgjdbc/pgjdbc · Discussion #3184 (
>> github.com)
>>>
>>> Here are the plans.
>>>
>>> JDBC - Nested Loop (incorrect result)
>>>
>>> Index Cond: (mutation >= ((CURRENT_DATE -
>> '1971-12-31'::date) - 28))
>>
>>> JDBC - Hash Right (correct result)
>>>
>>> Recheck Cond: (mutation >= ((CURRENT_DATE -
>> '1971-12-31'::date) - 29))
>>
>> I don't see any version details or queries, but going by the
>> conditions above, the queries don't appear to be the same, so
>> different results aren't too surprising and not a demonstration that
>> there's any sort of bug.
>>
>
> Sorry, you are correct. Version is 12.14. Here is the query
>
> SELECT
> p.partseqno_i
> , p.partno
> , p.partmatch
> , pfe.average_price
> , pfe.sales_price
> , pfe.purch_price
> , pfe.average_price_2
> , pfe.avg_repair_cost
> , pfe.average_price_func
> , pfe.fsv
> , pfe.fsv_func
> , p.status
>
> FROM part p
> LEFT JOIN part_fa_entity pfe ON (p.partno = pfe.partno)
> WHERE 1=1
> AND (p.mutation >= (CURRENT_DATE - '1971-12-31'::date)-27) ORDER BY p.partno
>

I guess the confusing bit is that the report does not claim that those
queries are expected to produce the same result, but that the parameter
value affects which plan gets selected, and one of those plans produces
incorrect result.

I think the simplest explanation might be that one of the indexes on
part_fa_entity is corrupted and fails to lookup some rows by partno.
That would explain why the plan with seqscan works fine, but nestloop
with index scan is missing these rows.

They might try a couple things:

1) set enable_nestloop=off, see if the results get correct

2) try bt_index_check on i_39773, might notice some corruption

3) rebuild the index

If it's not this, they'll need to build a reproducer. It's really
difficult to deduce what's going on just from query plans for different
parameter values.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2024-03-28 22:10:33 Re: Popcount optimization using AVX512
Previous Message Amonson, Paul D 2024-03-28 22:03:04 RE: Popcount optimization using AVX512