From: | Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Define STATS_MIN_ROWS for minimum rows of stats in ANALYZE |
Date: | 2025-01-03 13:45:21 |
Message-ID: | 24ed07ad-e857-47a8-9477-49fc19fb89c9@tantorlabs.com |
Lists: | pgsql-hackers |
On 10.12.2024 16:32, Ilia Evdokimov wrote:
>
> On 09.12.2024 16:10, Ilia Evdokimov wrote:
>> Hi hackers,
>>
>> The repeated use of the number 300 in the ANALYZE-related code
>> creates redundancy and relies on scattered, sometimes unclear,
>> comments to explain its purpose. This can make the code harder to
>> understand, especially for new contributors who might not immediately
>> understand its significance. To address this, I propose introducing a
>> macro STATS_MIN_ROWS to represent this value and consolidating its
>> explanation in a single place, making the code more consistent and
>> readable.
>>
>> --
>> Best regards,
>> Ilia Evdokimov,
>> Tantor Labs LLC.
>
>
> Hi everyone,
>
> Currently, the value 300 is used as the basis for determining the
> number of rows sampled during ANALYZE, both for single-column and
> extended statistics. While this value has a well-established rationale
> for single-column statistics, its suitability for extended statistics
> remains uncertain, as no specific research has confirmed that this is
> an optimal choice for them. To better reflect this distinction, I
> propose introducing two macros: STATS_MIN_ROWS for single-column
> statistics and EXT_STATS_MIN_ROWS for extended statistics.
>
> This change separates the concerns of single-column and extended
> statistics sampling, making the code more explicit and easier to adapt
> if future research suggests a different approach for extended
> statistics. The values remain the same for now, but the introduction
> of distinct macros improves clarity and prepares the codebase for
> potential refinements.
>
> Does this seem like a reasonable approach to handling these differences?
>
> --
> Best regards,
> Ilia Evdokimov,
> Tantor Labs LLC.
Hi everyone,
In my opinion, it is more appropriate to define EXT_STATS_MIN_ROWS as
STATS_MIN_ROWS. I also reverted some of the code comments and rewrote
others. I have attached the patch.
Any thoughts?
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
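
For readers without the attachment, the arrangement described in this message can be sketched roughly as follows. This is an illustration only, not the contents of the patch: the macro names and the value 300 come from the thread, while the helper functions `stats_sample_rows` and `ext_stats_sample_rows` are hypothetical names invented for this example, and multiplying by the statistics target is an assumption about how the constant is applied.

```c
#include <assert.h>

/*
 * Minimum number of sample rows per unit of statistics target for
 * single-column statistics. The value 300 is the one discussed in
 * the thread.
 */
#define STATS_MIN_ROWS 300

/*
 * Extended statistics reuse the same value for now. Defining this
 * macro in terms of STATS_MIN_ROWS keeps the two in sync while
 * leaving a single place to change if a different value is chosen.
 */
#define EXT_STATS_MIN_ROWS STATS_MIN_ROWS

/* Compile-time check (C11) that the two macros currently agree. */
static_assert(EXT_STATS_MIN_ROWS == STATS_MIN_ROWS,
              "extended stats currently share the single-column minimum");

/* Hypothetical helper: rows to sample for single-column statistics. */
static int
stats_sample_rows(int stats_target)
{
    return STATS_MIN_ROWS * stats_target;
}

/* Hypothetical helper: rows to sample for extended statistics. */
static int
ext_stats_sample_rows(int stats_target)
{
    return EXT_STATS_MIN_ROWS * stats_target;
}
```

Under these assumptions, a statistics target of 100 gives 30000 sample rows from either helper, and redefining EXT_STATS_MIN_ROWS later would change only the extended-statistics path.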
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Define-macros-for-minimum-rows-of-stats.patch | text/x-patch | 7.7 KB |