From: | Zhang Mingli <zmlpostgres(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: Improve heapgetpage() performance, overhead from serializable |
Date: | 2023-07-17 01:55:07 |
Message-ID: | 4281c260-ec5e-40aa-bcc9-5e3888ca4a48@Spark |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Regards,
Zhang Mingli
On Jul 16, 2023 at 09:57 +0800, Andres Freund <andres(at)anarazel(dot)de>, wrote:
> Hi,
>
> Several loops which are important for query performance, like heapgetpage()'s
> loop over all tuples, have to call functions like
> HeapCheckForSerializableConflictOut() and PredicateLockTID() in every
> iteration.
>
> When serializable is not in use, all those functions do is to to return. But
> being situated in a different translation unit, the compiler can't inline
> (without LTO at least) the check whether serializability is needed. It's not
> just the function call overhead that's noticable, it's also that registers
> have to be spilled to the stack / reloaded from memory etc.
>
> On a freshly loaded pgbench scale 100, with turbo mode disabled, postgres
> pinned to one core. Parallel workers disabled to reduce noise. All times are
> the average of 15 executions with pgbench, in a newly started, but prewarmed
> postgres.
>
> SELECT * FROM pgbench_accounts OFFSET 10000000;
> HEAD:
> 397.977
>
> removing the HeapCheckForSerializableConflictOut() from heapgetpage()
> (incorrect!), to establish the baseline of what serializable costs:
> 336.695
>
> pulling out CheckForSerializableConflictOutNeeded() from
> HeapCheckForSerializableConflictOut() in heapgetpage(), and avoiding calling
> HeapCheckForSerializableConflictOut() in the loop:
> 339.742
>
> moving the loop into a static inline function, marked as pg_always_inline,
> called with static arguments for always_visible, check_serializable:
> 326.546
>
> marking the always_visible, !check_serializable case likely():
> 322.249
>
> removing TestForOldSnapshot() calls, which we pretty much already decided on:
> 312.987
>
>
> FWIW, there's more we can do, with some hacky changes I got the time down to
> 273.261, but the tradeoffs start to be a bit more complicated. And 397->320ms
> for something as core as this, is imo worth considering on its own.
>
>
>
>
> Now, this just affects the sequential scan case. heap_hot_search_buffer()
> shares many of the same pathologies. I find it a bit harder to improve,
> because the compiler's code generation seems to switch between good / bad with
> changes that seems unrelated...
>
>
> I wonder why we haven't used PageIsAllVisible() in heap_hot_search_buffer() so
> far?
>
>
> Greetings,
>
> Andres Freund
LGTM and I have a fool question:
if (likely(all_visible))
{
if (likely(!check_serializable))
scan->rs_ntuples = heapgetpage_collect(scan, snapshot, page, buffer,
block, lines, 1, 0);
else
scan->rs_ntuples = heapgetpage_collect(scan, snapshot, page, buffer,
block, lines, 1, 1);
}
else
{
if (likely(!check_serializable))
scan->rs_ntuples = heapgetpage_collect(scan, snapshot, page, buffer,
block, lines, 0, 0);
else
scan->rs_ntuples = heapgetpage_collect(scan, snapshot, page, buffer,
block, lines, 0, 1);
Does it make sense to combine if else condition and put it to the incline function’s param?
Like:
scan->rs_ntuples = heapgetpage_collect(scan, snapshot, page, buffer,
block, lines, all_visible, check_serializable);
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2023-07-17 02:17:40 | Re: logicalrep_message_type throws an error |
Previous Message | Michael Paquier | 2023-07-17 01:27:22 | Re: Requiring recovery.signal or standby.signal when recovering with a backup_label |