From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Junwang Zhao <zhjwpku(at)gmail(dot)com> |
Subject: | Some ExecSeqScan optimizations |
Date: | 2024-12-20 11:41:25 |
Message-ID: | CA+HiwqGaH-otvqW_ce-paL=96JvU4j+Xbuk+14esJNDwefdkOg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I’ve been looking into possible optimizations for ExecSeqScan() and
chatting with colleagues about cases where it shows up prominently in
analytics-style query plans.
For example, in queries like `SELECT agg(somecol) FROM big_table WHERE
<condition>`, ExecScan() often dominates the profile. Digging into it,
I found two potential sources of overhead:
1. Run-time checks for PlanState.qual and PlanState.ps_ProjInfo
nullness: these checks are done repeatedly, which seems unnecessary if
we know the values at plan init time.
2. Overhead from ExecScanFetch() when EvalPlanQual() isn’t relevant:
Andres pointed out that ExecScanFetch() introduces unnecessary
overhead even in the common case where EvalPlanQual() isn’t
applicable.
To address (1), I tried assigning specialized functions to
PlanState.ExecProcNode in ExecInitSeqScan() based on whether qual or
projInfo are NULL. Inspired by David Rowley’s suggestion to look at
ExecHashJoinImpl(), I wrote variants like ExecSeqScanNoQual() (for
qual == NULL) and ExecSeqScanNoProj() (for projInfo == NULL). These
call a local version of ExecScan() that lives in nodeSeqScan.c, marked
always-inline. This local copy takes qual and projInfo as arguments,
letting compilers inline and optimize unnecessary branches away.
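The specialization pattern described for (1) can be sketched roughly as follows. This is a standalone toy model, not code from the patch: every `Toy*` / `toy_*` name is hypothetical, the "rows" are plain integers, and the always-inline attribute is spelled for GCC/Clang only. It shows one always-inline worker that takes the qual and projection as arguments, thin wrappers that pass literal NULLs so the compiler can drop the corresponding branches, and an init function that picks the wrapper once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; these are NOT PostgreSQL's real structures. */
typedef struct ToyScan ToyScan;
typedef bool (*ToyQual) (int row);
typedef int (*ToyProj) (int row);
typedef bool (*ToyExecProc) (ToyScan *scan, int *out);

struct ToyScan
{
    const int  *rows;
    size_t      nrows;
    size_t      pos;
    ToyQual     qual;   /* NULL means "no filter" */
    ToyProj     proj;   /* NULL means "return the row as-is" */
    ToyExecProc exec;   /* chosen once at init time */
};

#if defined(__GNUC__) || defined(__clang__)
#define TOY_ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define TOY_ALWAYS_INLINE static inline
#endif

/* Shared worker: qual and proj arrive as arguments, so a caller passing a
 * literal NULL lets the compiler remove that branch in the inlined copy. */
TOY_ALWAYS_INLINE bool
toy_scan_impl(ToyScan *scan, ToyQual qual, ToyProj proj, int *out)
{
    while (scan->pos < scan->nrows)
    {
        int row = scan->rows[scan->pos++];

        if (qual != NULL && !qual(row))
            continue;           /* row filtered out, fetch the next one */
        *out = (proj != NULL) ? proj(row) : row;
        return true;
    }
    return false;               /* end of scan */
}

/* One thin wrapper per combination of "has qual" / "has projection". */
static bool toy_scan_plain(ToyScan *scan, int *out)
{
    return toy_scan_impl(scan, NULL, NULL, out);
}

static bool toy_scan_qual(ToyScan *scan, int *out)
{
    assert(scan->qual != NULL);
    return toy_scan_impl(scan, scan->qual, NULL, out);
}

static bool toy_scan_proj(ToyScan *scan, int *out)
{
    assert(scan->proj != NULL);
    return toy_scan_impl(scan, NULL, scan->proj, out);
}

static bool toy_scan_qual_proj(ToyScan *scan, int *out)
{
    assert(scan->qual != NULL && scan->proj != NULL);
    return toy_scan_impl(scan, scan->qual, scan->proj, out);
}

/* Pick the variant once, instead of re-testing for NULL on every row. */
static void
toy_scan_init(ToyScan *scan, const int *rows, size_t nrows,
              ToyQual qual, ToyProj proj)
{
    scan->rows = rows;
    scan->nrows = nrows;
    scan->pos = 0;
    scan->qual = qual;
    scan->proj = proj;
    if (qual == NULL)
        scan->exec = (proj == NULL) ? toy_scan_plain : toy_scan_proj;
    else
        scan->exec = (proj == NULL) ? toy_scan_qual : toy_scan_qual_proj;
}

/* Small helpers for trying the sketch out. */
static bool toy_is_even(int row) { return row % 2 == 0; }
static int  toy_double(int row)  { return row * 2; }

static long
toy_sum(ToyScan *scan)
{
    long total = 0;
    int  value;

    while (scan->exec(scan, &value))
        total += value;
    return total;
}
```

In this toy, `toy_scan_init()` plays the role that ExecInitSeqScan() plays in the description above, and `exec` stands in for PlanState.ExecProcNode; how the actual patch names and splits its variants may differ.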
For (2), the local ExecScan() copy avoids the generic ExecScanFetch()
logic, simplifying things further when EvalPlanQual() doesn’t apply.
That has the additional benefit of allowing SeqNext() to be called
directly instead of via an indirect function pointer. This reduces the
overhead of indirect calls and enables better compiler optimizations
like inlining.
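As a rough illustration of (2), the toy below contrasts a generic fetch that tests a special-case flag and calls the access method through a function pointer with a direct fetch that skips both. All names are hypothetical, and the `recheck_active` flag is only a loose stand-in for the EvalPlanQual() handling in ExecScanFetch(), not a model of what it really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy cursor; not a PostgreSQL structure. */
typedef struct ToyCursor
{
    const int  *rows;
    size_t      nrows;
    size_t      pos;
    bool        recheck_active; /* stand-in for "special case applies" */
    int         recheck_row;    /* row to hand back in that special case */
} ToyCursor;

typedef bool (*ToyAccess) (ToyCursor *cursor, int *out);

/* The access method: return the next row, or false at end of scan. */
static bool
toy_seq_next(ToyCursor *cursor, int *out)
{
    if (cursor->pos >= cursor->nrows)
        return false;
    *out = cursor->rows[cursor->pos++];
    return true;
}

/* Generic path: checks the special case on every call and reaches the
 * access method through a function pointer. */
static bool
toy_fetch_generic(ToyCursor *cursor, ToyAccess access, int *out)
{
    if (cursor->recheck_active)
    {
        cursor->recheck_active = false;
        *out = cursor->recheck_row;
        return true;
    }
    return access(cursor, out);
}

/* Specialized path: the caller guarantees the special case is off, so the
 * access method is called directly and can be inlined by the compiler. */
static inline bool
toy_fetch_direct(ToyCursor *cursor, int *out)
{
    assert(!cursor->recheck_active);
    return toy_seq_next(cursor, out);
}
```

When the flag is off, both paths return the same rows in the same order; the direct one simply has less work per call, which is the effect described above.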
Junwang Zhao helped create a benchmark to test the patch; the results
are available in the spreadsheet at [1]. They show that the patch
generally reduces the latency of queries of the shape
`SELECT agg(somecol or *) FROM big_table WHERE <condition>`, with up to
a 5% improvement in some cases.
Would love to hear thoughts.
--
Thanks, Amit Langote
[1] https://docs.google.com/spreadsheets/d/1AsJOUgIfSsYIJUJwbXk4aO9FVOFOrBCvrfmdQYkHIw4/edit?usp=sharing
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Introduce-optimized-ExecSeqScan-variants-for-tail.patch | application/octet-stream | 7.3 KB |