From: | Daniel Migowski <dmigowski(at)ikoffice(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | AW: BUG #18205: Performance regression with NOT NULL checks. |
Date: | 2023-11-19 21:15:37 |
Message-ID: | 41ED3F5450C90F4D8381BC4D8DF6BBDC0174014A20@EXCHANGESERVER.ikoffice.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Thank you for the detailed explanation. It surely has a reason to remove the previously used short-circuit in slot_getattr, at least in the special case when we
And yes, in my real world use case which I tried to understand I have a wide table with 81 columns, and I am using column 1,43,18,75 and filter by attribute 82,42, and 24.
Don't know how to handle this, maybe I should rework the layout of all my tables and move the mostly used columns to the beginning.
-----Ursprüngliche Nachricht-----
Von: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Gesendet: Sonntag, 19. November 2023 20:08
An: Daniel Migowski <dmigowski(at)ikoffice(dot)de>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org; Andres Freund <andres(at)anarazel(dot)de>
Betreff: Re: BUG #18205: Performance regression with NOT NULL checks.
PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> I found out that selecting from a wide table with a single not-null
> where clause leads to severe performance regression when upgrading
> from PostgreSQL
> 9.5 to PostgreSQL 15.
I spent some time poking into this. "git bisect" pins the blame on
commit b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755
Author: Andres Freund <andres(at)anarazel(dot)de>
Date: Tue Mar 14 15:45:36 2017 -0700
Faster expression evaluation and targetlist projection.
The time needed for the seqscan gets about 50% worse at that commit (on my test machine, anyway):
-> Seq Scan on testtable t (cost=0.00..51892.80 rows=9730 width=4) (actual time=0.010..204.937 rows=6000000 loops=1)
Filter: (y IS NULL)
versus
-> Seq Scan on testtable t (cost=0.00..51892.80 rows=9730 width=4) (actual time=0.013..317.069 rows=6000000 loops=1)
Filter: (y IS NULL)
"perf" says that the extra time is mostly spent in slot_deform_tuple's inner loop:
for (; attnum < natts; attnum++)
{
Form_pg_attribute thisatt = att[attnum];
if (hasnulls && att_isnull(attnum, bp))
{
values[attnum] = (Datum) 0;
isnull[attnum] = true;
slow = true; /* can't use attcacheoff anymore */
continue;
}
...
which confused me, because that code doesn't look materially different in v10 than 9.6. I eventually realized that the reason is that we reach slot_deform_tuple with natts = 26 in the new code, but in the old code we do not, thanks to this short-circuit in slot_getattr:
/*
* check if target attribute is null: no point in groveling through tuple
*/
if (HeapTupleHasNulls(tuple) && att_isnull(attnum - 1, tup->t_bits))
{
*isnull = true;
return (Datum) 0;
}
So that results in not having to deconstruct most of the tuple, whereas in the new code we do have to, thanks to b8d7f053c's decision to batch all the variable-value-extraction work.
This is a pretty narrow corner case: it would only matter if the column you're testing for null-ness is far past any other column the query needs to fetch. So I'm not sure that it's worth doing anything about. You could imagine special processing for
NullTest-on-a-simple-Var: exclude the Var from the ones that we extract normally and instead compile it into some sort of "is column k NULL" test on the HeapTuple. But that seems messy, and it could be a significant pessimization for storage methods that aren't like heapam.
On the whole I'm inclined to say "sorry, that's the price of progress". But it is a bit sad that a patch intended to be a performance win shows a loss this big on a (presumably) real-world case.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-11-19 21:30:49 | Re: BUG #18205: Performance regression with NOT NULL checks. |
Previous Message | Andres Freund | 2023-11-19 21:12:03 | Re: BUG #18205: Performance regression with NOT NULL checks. |