Quick Links

Re: rows selectivity overestimate for @> operator for arrays

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: rows selectivity overestimate for @> operator for arrays
Date:	2022-05-27 18:02:11
Message-ID:	2694646.1653674531@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com> writes:
> so if we store mostly one element in array and they're almost all
> distinct then in tables with more then stats_target/0.0063 (~1.58M for
> maximum stats target 10000) rows we'll get 0.005 constant for selectivity.

Yeah. There's a comment in array_selfuncs.c about

* TODO: this estimate probably could be improved by using the distinct
* elements count histogram. For example, excepting the special case of
* "column @> '{}'", we can multiply the calculated selectivity by the
* fraction of nonempty arrays in the column.

but I'm not sure whether that's relevant here.

One thought is that if there is a pg_statistic row but it contains
no MCE list, we could assume that the column elements are all distinct
and see what sort of estimate that leads us to.

regards, tom lane

In response to

rows selectivity overestimate for @> operator for arrays at 2022-05-27 13:39:22 from Alexey Ermakov

Browse pgsql-performance by date

	From	Date	Subject
Next Message	James Pang (chaolpan)	2022-05-29 13:20:12	postgres backend process hang on " D " state
Previous Message	Justin Pryzby	2022-05-27 17:56:47	Re: How to monitor Postgres real memory usage