Re: Histogram question.

From: Steve Midgley <science(at)misuse(dot)org>
To: Jian He <hejian(dot)mark(at)gmail(dot)com>
Cc: pgsql-sql <pgsql-sql(at)lists(dot)postgresql(dot)org>
Subject: Re: Histogram question.
Date: 2022-04-05 15:14:57
Message-ID: CAJexoS+NPAb4BLRSRYP=qNwaq3A591D6run1Qwa85mV_10ehYA@mail.gmail.com
Lists: pgsql-sql

On Tue, Apr 5, 2022 at 7:35 AM Jian He <hejian(dot)mark(at)gmail(dot)com> wrote:

> Queries in PostgreSQL: 2. Statistics : Postgres Professional
> <https://postgrespro.com/blog/pgsql/5969296>
>
>
>
> SELECT sum(s.most_common_freqs[
>          array_position((s.most_common_vals::text::text[]), v)
>        ])
> FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
> WHERE s.tablename = 'boarding_passes' AND s.attname = 'seat_no';
>
> returns *0.6762*.
>
> SELECT sum(s.most_common_freqs[
>          array_position((s.most_common_vals::text::text[]), v)
>        ])
> FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
> WHERE s.tablename = 'boarding_passes' AND s.attname = 'seat_no'
>   AND v > '30C';
>
> returns *0.2127*.
>
> SELECT round( reltuples * (
>     0.2127 -- from most common values
>   + (1 - 0.6762 - 0) * (49 / 100.0) -- from histogram
> ))
> FROM pg_class WHERE relname = 'boarding_passes';
>
> In the query above, the part I don't understand is *49/100*.
>
>
I believe the exercise builds a histogram of the column's values over a
series of intervals (buckets). The 49/100 (if I'm reading the source
material correctly) is the share of those intervals that match the
condition: 49 of the 100 intervals cover boarding passes in the queried
range. I didn't read how the article defines the intervals, but I think
that's what the "49" refers to.
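
To make the arithmetic concrete, here is a rough sketch in Python of how the
pieces of the quoted formula seem to fit together. The function and
parameter names are mine, not from the article, and I am assuming the
trailing "- 0" in the query is the column's fraction of NULLs; treat it as
an illustration rather than how the planner is actually implemented.

```python
def range_selectivity(mcv_match, mcv_total, null_frac,
                      buckets_match, buckets_total):
    """Estimate the fraction of rows matching a range condition.

    mcv_match     -- summed frequency of most common values that match
    mcv_total     -- summed frequency of all most common values
    null_frac     -- assumed meaning of the "- 0" term (fraction of NULLs)
    buckets_match -- histogram intervals that fall in the queried range
    buckets_total -- total number of histogram intervals
    """
    # Rows that are neither a most common value nor NULL are the ones
    # described by the histogram.
    histogram_share = 1.0 - mcv_total - null_frac
    # Each interval is taken to hold an equal share of those rows.
    return mcv_match + histogram_share * (buckets_match / buckets_total)


# Numbers taken from the quoted queries.
selectivity = range_selectivity(0.2127, 0.6762, 0.0, 49, 100)
print(round(selectivity, 4))  # 0.3714
```

Multiplying that fraction by reltuples from pg_class gives the row estimate
that the last quoted query rounds.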
