From: | Steve Midgley <science(at)misuse(dot)org> |
---|---|
To: | Jian He <hejian(dot)mark(at)gmail(dot)com> |
Cc: | pgsql-sql <pgsql-sql(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Histogram question. |
Date: | 2022-04-05 15:14:57 |
Message-ID: | CAJexoS+NPAb4BLRSRYP=qNwaq3A591D6run1Qwa85mV_10ehYA@mail.gmail.com |
Lists: | pgsql-sql |
On Tue, Apr 5, 2022 at 7:35 AM Jian He <hejian(dot)mark(at)gmail(dot)com> wrote:
> Queries in PostgreSQL: 2. Statistics : Postgres Professional
> <https://postgrespro.com/blog/pgsql/5969296>
>
>
>
> SELECT sum(s.most_common_freqs[
>          array_position((s.most_common_vals::text::text[]), v)
>        ])
> FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
> WHERE s.tablename = 'boarding_passes' AND s.attname = 'seat_no';
>
> returns 0.6762.
>
> SELECT sum(s.most_common_freqs[
>          array_position((s.most_common_vals::text::text[]), v)
>        ])
> FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
> WHERE s.tablename = 'boarding_passes' AND s.attname = 'seat_no'
>   AND v > '30C';
>
> returns 0.2127.
>
> SELECT round( reltuples * (
>     0.2127 -- from most common values
>   + (1 - 0.6762 - 0) * (49 / 100.0) -- from histogram
> ))
> FROM pg_class WHERE relname = 'boarding_passes';
>
> In the query above, the part I don't understand is the 49/100.
>
>
I believe the exercise estimates a row count from the column's histogram,
which divides the data values into a series of intervals. The 49/100 (if I'm
reading the source material correctly) is the fraction of those intervals
that match the condition: 49 of the 100 intervals hold boarding passes with
seat_no > '30C'. I didn't read how the intervals are defined, but I think
that's what the "49" refers to.
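
To make the arithmetic concrete, here is a minimal sketch of the estimate in the quoted query, using only the numbers that appear in it. The function and parameter names are hypothetical, and reading the "- 0" term as the column's null fraction is an assumption on my part, not something stated in this thread.

```python
def estimate_selectivity(mcv_matching, mcv_total, null_frac,
                         buckets_matching, buckets_total):
    """Combine the most-common-values share with the histogram share.

    mcv_matching     -- summed frequency of common values matching the condition
    mcv_total        -- summed frequency of all common values
    null_frac        -- assumed meaning of the "- 0" term in the quoted query
    buckets_matching -- histogram intervals that match the condition
    buckets_total    -- total number of histogram intervals
    """
    # Fraction of rows that are neither common values nor (assumed) NULLs;
    # only this remainder is described by the histogram.
    histogram_share = 1.0 - mcv_total - null_frac
    # Each interval is treated as holding an equal slice of that remainder.
    return mcv_matching + histogram_share * (buckets_matching / buckets_total)


# Numbers taken from the quoted queries: 0.2127, 0.6762, 0, and 49/100.
selectivity = estimate_selectivity(0.2127, 0.6762, 0.0, 49, 100)
print(round(selectivity, 6))
```

The quoted query then multiplies this fraction by reltuples from pg_class to get an estimated row count; reltuples is not given in this thread, so it is left out here.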