Re: Expr. extended stats are skipped with equality operator

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Danny Shemesh <dany74q(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Subject: Re: Expr. extended stats are skipped with equality operator
Date: 2022-08-05 15:16:45
Message-ID: 20220805151645.GR19644@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 05, 2022 at 04:43:36PM +0300, Danny Shemesh wrote:
> 2. Less important, just a minor note - feel free to ignore - although the
> eq. operator above seems to be skipped when matching the ext. stats, I can
> work around this by using a CASE expression (fiddle
> <https://www.db-fiddle.com/f/wJZNH1rNwJSo3D5aByQiWX/1>);
> Building on the above example, we can:
> *create statistics s2 on (case x[1] when 1 then true else false end) from
> t1;*
> *explain analyze select * from t1 where (case x[1] when 1 then true else
> false end*
> *> Seq Scan on t1 (cost=0.00..1986.00 rows=100000 width=25) (actual
> time=0.011..33.721 rows=100000 loops=1)*
>
> What's a bit problematic here, though, is that if we mix other dependent
> columns to the extended stat, and specifically if we create an mcv,
> queries involving the CASE expression throw with `error: unknown clause
> type 130`, where clause type == T_CaseExpr.

> The second point for me would be that I've found it a bit non intuitive
> that creating an extended statistic can fail queries at query time; it

A reproducer for this:

CREATE TABLE t1(x int[], y float);
INSERT INTO t1 SELECT array[1], a FROM generate_series(1,99)a;
CREATE STATISTICS s2 ON (CASE x[1] WHEN 1 THEN true ELSE false END), y FROM t1;
ANALYZE t1;

explain analyze SELECT * FROM t1 WHERE CASE x[1] WHEN 1 THEN true ELSE false END AND y=1;
ERROR: unknown clause type: 134
\errverbose
ERROR: XX000: unknown clause type: 134
LOCATION: mcv_get_match_bitmap, mcv.c:1950

I'm not sure what Tomas will say, but XX000 errors from elog() are internal and
not intended to be user-facing, which is why there's no attempt to output a
friendly clause name. It might be that this wasn't reachable until statistics
on expressions were added in v14.

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2022-08-05 15:55:33 Re: A proposal for shared memory based backup infrastructure
Previous Message Robert Haas 2022-08-05 14:54:26 Re: Cleaning up historical portability baggage