From: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
---|---|
To: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Kirill Reshke <reshkekirill(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Consider the number of columns in the sort cost model |
Date: | 2024-10-29 02:47:27 |
Message-ID: | bea26aec-b9fd-4fac-a06e-bbad4f154287@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 10/28/24 16:48, Alena Rybakina wrote:
> On 23.10.2024 04:39, Andrei Lepikhov wrote:
>> On 15/10/2024 12:15, David Rowley wrote:
>> And the last patch is a demo of how I'm going to use the previous
>> three patches and add one more strategy to improve the order of
>> columns in the GROUP-BY clause.
>>
> To be honest, I didn’t find information about this in the code, but did
> I understand everything correctly?
Yes
>
> 2. I noticed that statistics of distinct values are calculated several
> times. Maybe I miss something, but this can be avoided if these
> statistics can be saved after calculation. For example, I saw that it is
> possible that you calculate the same statistic information for the same
> equivalence members in cost_incremental_sort and identify_sort_ecmember.
> Is it possible to store information in a structure and use it later?
Hmm, I don't see multiple calculations. em_distinct has made
specifically for this sake. Can you provide specific case and code lines?
>
> 3. I think you should initialize the variable ndist in your patch. I
> faced the compile complaining during your code compilation.
>
> costsize.c: In function ‘identify_sort_ecmember’:
> costsize.c:6694:42: error: ‘ndist’ may be used uninitialized
> [-Werror=maybe-uninitialized]
> 6694 | em->em_ndistinct = ndist;
> | ~~~~~~~~~~~~~~~~~^~~~~
> costsize.c:6680:33: note: ‘ndist’ was declared here
> 6680 | double ndist;
> | ^~~
> cc1: all warnings being treated as errors
> gmake[4]: *** [<builtin>: costsize.o] Error 1
I think you can just update your compiler. But I added the ndist
initialisation to make more compilers happy :).
> + Assert (node != NULL);
> +
> examine_variable(root, node, 0, &vardata);
> if (!HeapTupleIsValid(vardata.statsTuple))
> continue;
I don't think so. At least until you provide the case when the
get_sortgroupclause_expr function returns NULL.
That's more, remember - the patch 0004 here is just to show the
perspective and still under construction.
Anyway, thanks, I found out that the patch set doesn't apply correctly
because of 828e94c. So, see the new version in the attachment.
--
regards, Andrei Lepikhov
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Stabilise-incremental-sort-cost-calculation.patch | text/x-patch | 9.2 KB |
v2-0002-Consider-the-number-of-columns-in-the-cost-sort-m.patch | text/x-patch | 13.8 KB |
v2-0003-Consider-ndistinct-on-the-first-column-in-cost_so.patch | text/x-patch | 8.7 KB |
v2-0004-Add-GROUP-BY-strategy-put-the-most-distinct-colum.patch | text/x-patch | 7.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2024-10-29 03:06:25 | Re: EXPLAIN IndexOnlyScan shows disabled when enable_indexonlyscan=on |
Previous Message | Peter Smith | 2024-10-29 02:14:17 | Re: Pgoutput not capturing the generated columns |