Re: Showing applied extended statistics in explain Part 2

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Tatsuro Yamada <yamatattsu(at)gmail(dot)com>, Masahiro(dot)Ikeda(at)nttdata(dot)com
Cc: tomas(dot)vondra(at)enterprisedb(dot)com, tatsuro(dot)yamada(at)ntt(dot)com, pgsql-hackers(at)postgresql(dot)org, Masao(dot)Fujii(at)nttdata(dot)com
Subject: Re: Showing applied extended statistics in explain Part 2
Date: 2024-11-18 21:15:55
Message-ID: 6e44ed21-2c45-4069-be36-16bdb23e2fd5@vondra.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/18/24 13:52, Ilia Evdokimov wrote:
> Hi everyone!
>
> Thank you for your work.
>
> 1) While exploring extended statistics, I encountered a bug that occurs
> when using EXPLAIN (STATS) with queries containing OR conditions:
>
> CREATE TABLE t (a int, b int, c int, d int);
> INSERT INTO t SELECT x/10+1, x, x + 10, x * 2 FROM
> generate_series(1,10000) g(x);
> CREATE STATISTICS ON a, b FROM t;
> CREATE STATISTICS ON c, d FROM t;
> ANALYZE;
>
> The following query works as expected:
>
> EXPLAIN (STATS) SELECT * FROM t WHERE a > 0 AND b > 0 AND c > 0 AND d > 0;
>                            QUERY PLAN
> ----------------------------------------------------------------
>  Seq Scan on t  (cost=0.00..255.00 rows=10000 width=16)
>    Filter: ((a > 0) AND (b > 0) AND (c > 0) AND (d > 0))
>    Ext Stats: public.t_a_b_stat  Clauses: ((a > 0) AND (b > 0))
>    Ext Stats: public.t_c_d_stat  Clauses: ((c > 0) AND (d > 0))
> (4 rows)
>
> However, when using OR conditions, the following query results in an error:
>
> EXPLAIN (ANALYZE, STATS) SELECT * FROM t WHERE a > 0 AND b > 0 OR c > 0
> AND d > 0;
> ERROR:  unrecognized node type: 314
>

I believe this is the issue I mentioned when I first posted the original
version of this patch:

> 2) The deparsing is modeled (i.e. copied) from how we deal with index
> quals, but it's having issues with nested OR clauses, because there
> are nested RestrictInfo nodes and the deparsing does not expect that.

In other words, the AND-clauses happens to be parsed like this:

BoolExpr (boolop=and)
RestrictInfo (clause=OpExpr, ...)
RestrictInfo (clause=OpExpr, ...)

And the deparse_expression() machinery does not expect RI nodes, which
is where the error message comes from.

An obvious solution would be to extend get_rule_expr() like this:

case T_RestrictInfo:
get_rule_expr((Node *) ((RestrictInfo *) node)->clause,
context, showimplicit);
break;

But I think this would be wrong - AFAIK ruleutils.c is meant to handle
these nodes. As the ruleutils.c header says:

Functions to convert stored expressions/querytrees back to
source text

And RestrictInfo surely is not meant to be stored anywhere - it's a
runtime only node, caching some data.

So I think the correct solution is to not pass any expressions with
RestrictInfo to deparse_expression(). Either by stripping the nodes, or
by not adding them at all.

The patch tries to do the stripping by maybe_extract_actual_clauses(),
but that only looks at the top node, and that is not sufficient here.
Maybe it would be possible to walk the whole tree, and remove all the
RestrictInfos nodes - including intermediate ones, not just the top. But
I'm not quite sure it wouldn't cause issues elsewhere (assuming it
modifies the existing nodes). It still feels a bit like fixing a problem
we shouldn't really have ...

The only alternative approach I can think of is to make sure we never
add any RestrictInfo nodes in these lists. But we build them for
selectivity estimation, and the RestrictInfo is meant for that.

So I'm a bit unsure what to do about this :-(

In any case, I think this shows the patch needs more tests.

> 2) It would be great if the STATS flag appeared as an option when
> pressing Tab during query input in the psql command-line interface.
>

True. Tab autocomplete would be nice.

regards

--
Tomas Vondra

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2024-11-18 21:16:09 Re: UUID v7
Previous Message Bruce Momjian 2024-11-18 21:04:20 Re: Doc: typo in config.sgml