Re: Statistics on array values

From: Marco Colli <collimarco91(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Statistics on array values
Date: 2020-02-02 14:38:57
Message-ID: CAFvCgN5JLC55y+Dk_5aMiVw-godC4pnVHALw=S0Lo7PnL++g8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> Is one of those estimates way off reality, or is it only the conjunction
which is deranged?

The estimate is wrong *even with a single tag*, without the conjunction
(e.g. expected 3500, actual 20). Then the conjunction can make the bias
even worse...

On Sun, Feb 2, 2020 at 3:23 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Sun, Feb 02, 2020 at 03:18:19PM +0100, Marco Colli wrote:
> > Hello!
> >
> > Let's say that you have a simple query like the following on a large
> table
> > (for a multi-tenant application):
> > SELECT "subscribers".* FROM "subscribers" WHERE
> "subscribers"."project_id"
> > = 123 AND (tags @> ARRAY['de']::varchar[]);
> >
> > If you run EXPLAIN ANALYZE you can see that stats are completely wrong.
> > For example I get an expected count of 3,500 rows whereas the actual
> > result is 20 rows. This also results in bad query plans...
>
>
> https://www.postgresql.org/message-id/CAMkU%3D1z%2BQijUWAYgeqeyw%2BAvD7adPgOmEnY%2BOcTw6qDVFtD7cQ%40mail.gmail.com
> On Fri, Jan 10, 2020 at 12:12:52PM -0500, Jeff Janes wrote:
> > Why is the estimate off by so much? If you run a simple select, what the
> > actual and expected number of rows WHERE project_id = 12345? WHERE tags
> @>
> > '{crt:2018_11}'? Is one of those estimates way off reality, or is it
> only
> > the conjunction which is deranged?
>
> Could you respond to Jeff's inquiry ?
>
> Justin
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2020-02-02 17:11:05 Re: Statistics on array values
Previous Message Justin Pryzby 2020-02-02 14:23:21 Re: Statistics on array values