Re: rows selectivity overestimate for @> operator for arrays

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: rows selectivity overestimate for @> operator for arrays
Date: 2022-05-27 18:02:11
Message-ID: 2694646.1653674531@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com> writes:
> so if we store mostly one element in array and they're almost all
> distinct then in tables with more then stats_target/0.0063 (~1.58M for
> maximum stats target 10000) rows we'll get 0.005 constant for selectivity.

Yeah. There's a comment in array_selfuncs.c about

* TODO: this estimate probably could be improved by using the distinct
* elements count histogram. For example, excepting the special case of
* "column @> '{}'", we can multiply the calculated selectivity by the
* fraction of nonempty arrays in the column.

but I'm not sure whether that's relevant here.

One thought is that if there is a pg_statistic row but it contains
no MCE list, we could assume that the column elements are all distinct
and see what sort of estimate that leads us to.

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message James Pang (chaolpan) 2022-05-29 13:20:12 postgres backend process hang on " D " state
Previous Message Justin Pryzby 2022-05-27 17:56:47 Re: How to monitor Postgres real memory usage