Re: Define STATS_MIN_ROWS for minimum rows of stats in ANALYZE

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Define STATS_MIN_ROWS for minimum rows of stats in ANALYZE
Date: 2024-12-10 13:32:27
Message-ID: 93af9682-d7aa-49a9-842e-7599e8d36028@tantorlabs.com
Lists: pgsql-hackers


On 09.12.2024 16:10, Ilia Evdokimov wrote:
> Hi hackers,
>
> The repeated use of the number 300 in the ANALYZE-related code creates
> redundancy and relies on scattered, sometimes unclear, comments to
> explain its purpose. This can make the code harder to understand,
> especially for new contributors who might not immediately understand
> its significance. To address this, I propose introducing a macro
> STATS_MIN_ROWS to represent this value and consolidating its
> explanation in a single place, making the code more consistent and
> readable.
>
> --
> Best regards,
> Ilia Evdokimov,
> Tantor Labs LLC.

Hi everyone,

Currently, the value 300 is used as the multiplier that determines the
number of rows sampled during ANALYZE, both for single-column and
extended statistics. While this value has a well-established rationale
for single-column statistics, its suitability for extended statistics
remains uncertain: no specific research has confirmed that it is an
optimal choice for them. To better reflect this distinction, I propose
introducing two macros: STATS_MIN_ROWS for single-column statistics and
EXT_STATS_MIN_ROWS for extended statistics.

This change separates the concerns of single-column and extended
statistics sampling, making the code more explicit and easier to adapt
if future research suggests a different approach for extended
statistics. The values remain the same for now, but the introduction of
distinct macros improves clarity and prepares the codebase for potential
refinements.
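
To illustrate the idea, here is a minimal standalone sketch. It is not
the attached patch. The two macro names come from this proposal, while
the helper functions column_sample_rows() and ext_sample_rows() are
hypothetical names used only for the example. It assumes the sample
size is the macro value multiplied by the statistics target.

```c
#include <stdio.h>

/*
 * Minimum-rows multipliers, as proposed in this message.
 * Both keep the existing value of 300 for now.
 */
#define STATS_MIN_ROWS      300     /* single-column statistics */
#define EXT_STATS_MIN_ROWS  300     /* extended statistics */

/*
 * Hypothetical helper (not a PostgreSQL function): number of rows to
 * sample for a single-column statistics target.
 */
static int
column_sample_rows(int stattarget)
{
    return STATS_MIN_ROWS * stattarget;
}

/*
 * Hypothetical helper (not a PostgreSQL function): number of rows to
 * sample for an extended statistics object with the given target.
 */
static int
ext_sample_rows(int stattarget)
{
    return EXT_STATS_MIN_ROWS * stattarget;
}
```

With a statistics target of 100, both helpers return 30000 rows. If
later research suggested a different multiplier for extended
statistics, only EXT_STATS_MIN_ROWS would need to change.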

Does this seem like a reasonable approach to handling these differences?

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

Attachment Content-Type Size
v2-0001-Define-macros-for-minimum-rows-of-stats.patch text/x-patch 7.6 KB
