From: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andy Fan <zhihuifan1213(at)163(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me> |
Subject: | Hook for Selectivity Estimation in Query Planning |
Date: | 2025-03-05 08:41:42 |
Message-ID: | d1867416-198b-418a-be53-7df4e10aec62@gmail.com |
Lists: | pgsql-hackers |
Hi,
I would like to discuss the introduction of a hook for evaluating the
selectivity of an expression when searching for an optimal query plan.
This topic has been brought up in various discussions, for example, in [1].
Currently, extensions that interact with the optimiser can only add
their own paths and have no way to influence the optimiser's decisions.
As a result, when developing an extension that implements a new type of
statistics (such as a histogram for composite types), utilises knowledge
from previously executed queries, or implements some system of
selectivity hints, we find ourselves writing a considerable amount of
code. To make such an extension work reliably, we may end up developing
a separate optimiser or, at the very least, a custom join search (see
core.c in the pg_hint_plan extension for an idea of the amount of code
required).
A hook for evaluating selectivity could streamline the development of
methods to improve selectivity evaluation, making it easier to create
new types of statistics and estimation methods (I am mainly interested
in join clause estimation). Considering the limited amount of code
involved and the upcoming code freeze, I propose adding such a hook to
PostgreSQL 18 to assess how it simplifies extension development.
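To make the idea concrete, here is a minimal standalone sketch of the usual PostgreSQL hook pattern (a global function pointer that an extension saves and replaces) applied to selectivity estimation. It is not the attached patch: the names `Clause`, `selectivity_hook`, `standard_selectivity`, `estimate_selectivity`, `my_selectivity` and `install_my_hook` are hypothetical, the clause type is a toy stand-in for the planner's real structures, and the signature in the actual patch may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the planner's Selectivity type: a fraction in [0, 1]. */
typedef double Selectivity;

/* Hypothetical, simplified clause of the form "column = value". */
typedef struct Clause
{
    const char *column;
    int         value;
} Clause;

/* Hypothetical hook type; the real patch may use another signature. */
typedef Selectivity (*selectivity_hook_type) (const Clause *clause);

/* NULL means that no extension has installed a hook. */
selectivity_hook_type selectivity_hook = NULL;

/* Toy built-in estimator: a fixed default for every clause. */
Selectivity
standard_selectivity(const Clause *clause)
{
    (void) clause;
    return 0.005;
}

/* Entry point the planner would call: use the hook when it is set. */
Selectivity
estimate_selectivity(const Clause *clause)
{
    Selectivity s;

    assert(clause != NULL);

    if (selectivity_hook)
        s = selectivity_hook(clause);
    else
        s = standard_selectivity(clause);

    /* Clamp to a valid probability, whatever the hook returned. */
    if (s < 0.0)
        s = 0.0;
    if (s > 1.0)
        s = 1.0;
    return s;
}

/* ---- Extension side (assumed, for illustration only) ---- */

static selectivity_hook_type prev_selectivity_hook = NULL;

/* Override the estimate for one column, defer to others otherwise. */
static Selectivity
my_selectivity(const Clause *clause)
{
    if (strcmp(clause->column, "status") == 0)
        return 0.25;            /* e.g. learned from earlier executions */

    if (prev_selectivity_hook)
        return prev_selectivity_hook(clause);
    return standard_selectivity(clause);
}

/* What an extension would do once at load time: save, then replace. */
void
install_my_hook(void)
{
    prev_selectivity_hook = selectivity_hook;
    selectivity_hook = my_selectivity;
}
```

In this sketch the extension only answers for the clauses it knows about and passes everything else to the previous hook or the built-in estimator, so several extensions could be chained without replacing the optimiser's own logic.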
This proposed hook would complement the existing path hooks without
overlapping in functionality. Based on my experience implementing
adaptive features in enterprise solutions, I believe that additional
hooks could also be beneficial for estimating the number of groups and
the amount of memory allocated, which is currently based solely on
work_mem. However, these suggestions do not interfere with the current
proposal and could be considered later.
Critique:
In general, a hook for evaluating the number of rows appears to be a
more promising approach. It would allow the extension to access specific
RelOptInfo data, thus providing insights into where the evaluation takes
place within the plan. Consequently, this would enable a deeper
influence on the query plan choice. However, implementing such a hook
might be more invasive, requiring modifications to each cost function.
Additionally, it addresses a slightly different issue and can be
considered separately.
Attached is a patch containing the proposed hook code.
--
regards, Andrei Lepikhov
Attachment | Content-Type | Size |
---|---|---|
0001-Introduce-selectivity-hook.patch | text/plain | 2.5 KB |