From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | dmigowski(at)ikoffice(dot)de, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18205: Performance regression with NOT NULL checks. |
Date: | 2023-11-19 21:30:49 |
Message-ID: | 1147963.1700429449@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2023-11-19 14:08:05 -0500, Tom Lane wrote:
>> So that results in not having to deconstruct most of the tuple,
>> whereas in the new code we do have to, thanks to b8d7f053c's
>> decision to batch all the variable-value-extraction work.
> Yeah, I think we were aware at the time that this does have downsides - it's
> just that the worst-case behaviour of *not* batching is much bigger than the
> worst-case downside of batching.
Agreed. Still ...
> We actually did add fastpaths for a few similar cases: ExecJustInnerVar() etc
> will just use slot_getattr(). These can be used when the result is just a
> single variable. However, the goal there was more to avoid "interpreter
> startup" overhead, rather than evaluation overhead.
Yeah. Also, if I'm reading the example correctly, Daniel's case
*does* involve fetching more than a single column --- but the other ones
are up near the start so we didn't use to have to deform very much of
the tuple.
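To make the cost difference concrete, here is a rough toy model, not PostgreSQL code. ToySlot and the toy_* functions are hypothetical names invented for this illustration, and the sketch assumes the on-demand path can answer a null test from the null bitmap alone, without extracting any preceding columns. The bitmap follows the heap convention that a set bit means the value is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ATTS 64

/* Hypothetical toy slot; NOT PostgreSQL's TupleTableSlot. */
typedef struct ToySlot
{
    const int  *raw;            /* stand-in for the stored column data */
    uint64_t    notnull_bits;   /* bit i set => attribute i+1 is not null */
    int         natts;          /* number of attributes in the tuple */
    int         nvalid;         /* attributes extracted so far */
    int         values[TOY_MAX_ATTS];
    bool        isnull[TOY_MAX_ATTS];
    long        steps;          /* attributes walked, used as a cost proxy */
} ToySlot;

/* Extract attributes nvalid+1 .. upto, one "step" per attribute walked. */
static void
toy_getsomeattrs(ToySlot *slot, int upto)
{
    assert(slot->natts <= TOY_MAX_ATTS);
    assert(upto >= 0 && upto <= slot->natts);

    for (int i = slot->nvalid; i < upto; i++)
    {
        slot->isnull[i] = !((slot->notnull_bits >> i) & 1);
        slot->values[i] = slot->isnull[i] ? 0 : slot->raw[i];
        slot->steps++;
    }
    if (upto > slot->nvalid)
        slot->nvalid = upto;
}

/* On-demand null test: answered from the bitmap, nothing is extracted. */
static bool
toy_attisnull_ondemand(ToySlot *slot, int attnum)
{
    assert(attnum >= 1 && attnum <= slot->natts);

    if (attnum <= slot->nvalid)
        return slot->isnull[attnum - 1];
    return !((slot->notnull_bits >> (attnum - 1)) & 1);
}

/* Batched null test: extract everything up to attnum first, then look. */
static bool
toy_attisnull_batched(ToySlot *slot, int attnum)
{
    toy_getsomeattrs(slot, attnum);
    return slot->isnull[attnum - 1];
}
```

In this model, a null test on column 40 of a 40-column tuple costs zero steps on demand and 40 steps when batched. How closely that matches the real executor depends on the assumption stated above.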
> What if we instead load 8 bytes of the bitmap into a uint64 before entering
> the loop, and shift an "index" mask into the bitmap by one each iteration
> through the loop?
Meh. Seems like a micro-optimization that does nothing for the big-O
problem. One thing to think about is that I suspect "all the columns
are null" is just a simple test case and not very representative of
the real-world problem. In the real case, probably quite a few of
the leading columns are non-null, which would make Daniel's issue
even worse because slot_deform_tuple would have to do significantly
more work that it didn't do before. Shaving cycles off the null-column
fast path would be proportionally less useful too.
It might well be that what you suggest is worth doing just to cut
the cost of slot_deform_tuple across the board, but I don't think
it's an answer to this complaint specifically.
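For reference, a minimal sketch of the kind of loop Andres describes, set against a plain per-attribute byte lookup. Both functions are hypothetical illustrations rather than code from slot_deform_tuple. The sketch assumes at most 64 attributes and a bitmap in which a set bit means "not null", and it assembles the word byte by byte so that bit N corresponds to attribute N on any host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Baseline: index into the bitmap bytes once per attribute. */
static int
count_nulls_bytewise(const uint8_t *bits, int natts)
{
    int nulls = 0;

    for (int att = 0; att < natts; att++)
    {
        if (!(bits[att >> 3] & (1 << (att & 0x07))))
            nulls++;
    }
    return nulls;
}

/* Variant: load the bitmap into a uint64 once, then walk a one-bit mask. */
static int
count_nulls_masked(const uint8_t *bits, int natts)
{
    uint64_t word = 0;
    uint64_t mask = 1;
    int      nbytes = (natts + 7) / 8;
    int      nulls = 0;

    assert(natts >= 0 && natts <= 64);  /* toy limit: one 64-bit word */

    /* Assemble little-endian so bit N of word is attribute N's bit. */
    for (int i = 0; i < nbytes; i++)
        word |= (uint64_t) bits[i] << (8 * i);

    /* The mask moves up one bit per attribute; no byte indexing here. */
    for (int att = 0; att < natts; att++, mask <<= 1)
    {
        if (!(word & mask))
            nulls++;
    }
    return nulls;
}
```

Both loops still visit every attribute up to natts, so the mask version can only lower the per-column constant. It does not change how many columns get walked.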
regards, tom lane