Re: Showing applied extended statistics in explain Part 2

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Tatsuro Yamada <yamatattsu(at)gmail(dot)com>
Cc: Tatsuro Yamada <tatsuro(dot)yamada(at)ntt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Showing applied extended statistics in explain Part 2
Date: 2024-07-15 14:15:36
Message-ID: 7e97951a-5ae8-42f0-a665-c71fdc921fa7@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 6/28/24 13:16, Tatsuro Yamada wrote:
> Hi Tomas and All,
>
> Attached file is a new patch including:
> 6) Add stats option to explain command
> 7) The patch really needs some docs (partly)
>
> >4) Add new node (resolve errors in cfbot and prepared statement)
>
> I tried adding a new node in pathnode.h, but it doesn't work well.
> So, it needs more time to implement it successfully because this is
> the first time to add a new node in it.
>

I'm not sure why it didn't work well, and I haven't tried adding the
struct myself so I might be missing something important, but m
assumption was the new struct would go to plannodes.h. The planning
works in phases:

parse -> build Path nodes -> pick cheapest Path -> create Plan

and it's the Plan that is printed by EXPLAIN. The pathnodes.h and
plannodes.h match this, so if it's expected to be in Plan it should go
to plannodes.h I think.

>
>> 8) Add regression test (stats_ext.sql)
>
>
> Actually, I am not yet able to add new test cases to stats_ext.sql.

Why is that not possible? Can you explain?

> Instead, I created a simple test (test.sql) and have attached it.
> Also, output.txt is the test result.
>
> To add new test cases to stats_ext.sql,
> I'd like to decide on a strategy for modifying it. In particular, there are
> 381 places where the check_estimated_rows function is used, so should I
> include the same number of tests, or should we include the bare minimum
> of tests that cover the code path? I think only the latter would be fine.
> Any advice is appreciated. :-D
>

I don't understand. My suggestion was to create a new function, similar
to check_estimated_rows(), that's get a query, do EXPLAIN and extract
the list of applied statistics. Similar to what check_estimated_rows()
does for number of rows.

I did not mean to suggest you modify check_estimated_rows() to extract
both the number of rows and statistics, nor to modify the existing tests
(that's not very useful, because there's only one extended statistics in
each of those tests, and by testing the estimate we implicitly test that
it's applied).

My suggestion is to add a couple new queries, with multiple statistics
and multiple clauses etc. And then test the patch on those. You could do
simple EXPLAIN (COSTS OFF), or add the new function to make it a bit
more stable (but maybe it's not worth it).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2024-07-15 14:29:32 Re: Patch bug: Fix jsonpath .* on Arrays
Previous Message Tomas Vondra 2024-07-15 14:03:34 Re: Showing applied extended statistics in explain Part 2