From: | Hellmuth Vargas <hivs77(at)gmail(dot)com> |
---|---|
To: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
Cc: | Matt Zagrabelny <mzagrabe(at)d(dot)umn(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rob Sargent <robjsargent(at)gmail(dot)com>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: count(*) vs count(id) |
Date: | 2021-02-02 20:45:21 |
Message-ID: | CAN3Qy4q1VXQHbUsVH7+yKumdfyfcxQAaHTGS5XKgjZ-5PMOpRA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello list
My English is not very good, so I pretend that through the examples you
understand what I intend to expose
-- Recreate the query that is supposedly wrong
select calendar.entry, count(*)
from generate_series('2020-08-20'::date, '2020-09-15'::date, '1
day'::interval) as calendar(entry)
left join (values (1,'2020-08-28 09:44:11'::timestamp),
(2,'2020-08-28 10:44:11'::timestamp),
(3,'2020-08-29 11:44:11'::timestamp),
(4,'2020-09-01 02:44:11'::timestamp),
(5,'2020-09-02 03:44:11'::timestamp),
(6,'2020-09-02 04:44:11'::timestamp),
(7,'2020-09-03 05:44:11'::timestamp),
(8,'2020-09-04 06:44:11'::timestamp),
(10,'2020-09-04 07:44:11'::timestamp)) as call_records(id,timestamp)
on calendar.entry = call_records.timestamp::date
group by calendar.entry;
-- wrong???
entry | count
------------------------+-------
2020-08-20 00:00:00-05 | 1
2020-08-21 00:00:00-05 | 1
2020-08-22 00:00:00-05 | 1
2020-08-23 00:00:00-05 | 1
2020-08-24 00:00:00-05 | 1
2020-08-25 00:00:00-05 | 1
2020-08-26 00:00:00-05 | 1
2020-08-27 00:00:00-05 | 1
2020-08-28 00:00:00-05 | 2
2020-08-29 00:00:00-05 | 1
2020-08-30 00:00:00-05 | 1
2020-08-31 00:00:00-05 | 1
2020-09-01 00:00:00-05 | 1
2020-09-02 00:00:00-05 | 2
2020-09-03 00:00:00-05 | 1
2020-09-04 00:00:00-05 | 2
2020-09-05 00:00:00-05 | 1
2020-09-06 00:00:00-05 | 1
2020-09-07 00:00:00-05 | 1
2020-09-08 00:00:00-05 | 1
2020-09-09 00:00:00-05 | 1
2020-09-10 00:00:00-05 | 1
2020-09-11 00:00:00-05 | 1
2020-09-12 00:00:00-05 | 1
2020-09-13 00:00:00-05 | 1
2020-09-14 00:00:00-05 | 1
2020-09-15 00:00:00-05 | 1
-- In the count I will only consider the records of call_records
select calendar.entry, count(call_records.*)
from generate_series('2020-08-20'::date, '2020-09-15'::date, '1
day'::interval) as calendar(entry)
left join (values (1,'2020-08-28 09:44:11'::timestamp),
(2,'2020-08-28 10:44:11'::timestamp),
(3,'2020-08-29 11:44:11'::timestamp),
(4,'2020-09-01 02:44:11'::timestamp),
(5,'2020-09-02 03:44:11'::timestamp),
(6,'2020-09-02 04:44:11'::timestamp),
(7,'2020-09-03 05:44:11'::timestamp),
(8,'2020-09-04 06:44:11'::timestamp),
(10,'2020-09-04 07:44:11'::timestamp)) as call_records(id,timestamp)
on calendar.entry = call_records.timestamp::date
group by calendar.entry;
--- perfect
entry | count
------------------------+-------
2020-08-20 00:00:00-05 | 0
2020-08-21 00:00:00-05 | 0
2020-08-22 00:00:00-05 | 0
2020-08-23 00:00:00-05 | 0
2020-08-24 00:00:00-05 | 0
2020-08-25 00:00:00-05 | 0
2020-08-26 00:00:00-05 | 0
2020-08-27 00:00:00-05 | 0
2020-08-28 00:00:00-05 | 2
2020-08-29 00:00:00-05 | 1
2020-08-30 00:00:00-05 | 0
2020-08-31 00:00:00-05 | 0
2020-09-01 00:00:00-05 | 1
2020-09-02 00:00:00-05 | 2
2020-09-03 00:00:00-05 | 1
2020-09-04 00:00:00-05 | 2
2020-09-05 00:00:00-05 | 0
2020-09-06 00:00:00-05 | 0
2020-09-07 00:00:00-05 | 0
2020-09-08 00:00:00-05 | 0
2020-09-09 00:00:00-05 | 0
2020-09-10 00:00:00-05 | 0
2020-09-11 00:00:00-05 | 0
2020-09-12 00:00:00-05 | 0
2020-09-13 00:00:00-05 | 0
2020-09-14 00:00:00-05 | 0
2020-09-15 00:00:00-05 | 0
when placing * I want to bring all the join records between both tables and
when counting them of course there will be a row for the dates 2020-08-30
, 2020-08-31 so the call_records fields are null
select *
from generate_series('2020-08-20'::date, '2020-09-15'::date, '1
day'::interval) as calendar(entry)
left join (values (1,'2020-08-28 09:44:11'::timestamp),
(2,'2020-08-28 10:44:11'::timestamp),
(3,'2020-08-29 11:44:11'::timestamp),
(4,'2020-09-01 02:44:11'::timestamp),
(5,'2020-09-02 03:44:11'::timestamp),
(6,'2020-09-02 04:44:11'::timestamp),
(7,'2020-09-03 05:44:11'::timestamp),
(8,'2020-09-04 06:44:11'::timestamp),
(10,'2020-09-04 07:44:11'::timestamp)) as call_records(id,timestamp)
on calendar.entry = call_records.timestamp::date
entry | id | timestamp
------------------------+----+---------------------
2020-08-20 00:00:00-05 | |
2020-08-21 00:00:00-05 | |
2020-08-22 00:00:00-05 | |
2020-08-23 00:00:00-05 | |
2020-08-24 00:00:00-05 | |
2020-08-25 00:00:00-05 | |
2020-08-26 00:00:00-05 | |
2020-08-27 00:00:00-05 | |
2020-08-28 00:00:00-05 | 1 | 2020-08-28 09:44:11
2020-08-28 00:00:00-05 | 2 | 2020-08-28 10:44:11
2020-08-29 00:00:00-05 | 3 | 2020-08-29 11:44:11
2020-08-30 00:00:00-05 | |
2020-08-31 00:00:00-05 | |
2020-09-01 00:00:00-05 | 4 | 2020-09-01 02:44:11
2020-09-02 00:00:00-05 | 5 | 2020-09-02 03:44:11
2020-09-02 00:00:00-05 | 6 | 2020-09-02 04:44:11
2020-09-03 00:00:00-05 | 7 | 2020-09-03 05:44:11
2020-09-04 00:00:00-05 | 8 | 2020-09-04 06:44:11
2020-09-04 00:00:00-05 | 10 | 2020-09-04 07:44:11
2020-09-05 00:00:00-05 | |
2020-09-06 00:00:00-05 | |
2020-09-07 00:00:00-05 | |
2020-09-08 00:00:00-05 | |
2020-09-09 00:00:00-05 | |
2020-09-10 00:00:00-05 | |
2020-09-11 00:00:00-05 | |
2020-09-12 00:00:00-05 | |
2020-09-13 00:00:00-05 | |
2020-09-14 00:00:00-05 | |
2020-09-15 00:00:00-05 | |
--- counting
select entry, count(*)
from (
select *
from generate_series('2020-08-20'::date, '2020-09-15'::date, '1
day'::interval) as calendar(entry)
left join (values (1,'2020-08-28 09:44:11'::timestamp),
(2,'2020-08-28 10:44:11'::timestamp),
(3,'2020-08-29 11:44:11'::timestamp),
(4,'2020-09-01 02:44:11'::timestamp),
(5,'2020-09-02 03:44:11'::timestamp),
(6,'2020-09-02 04:44:11'::timestamp),
(7,'2020-09-03 05:44:11'::timestamp),
(8,'2020-09-04 06:44:11'::timestamp),
(10,'2020-09-04 07:44:11'::timestamp)) as call_records(id,timestamp)
on calendar.entry = call_records.timestamp::date
) as u
group by entry
entry | count
------------------------+-------
2020-08-20 00:00:00-05 | 1
2020-08-21 00:00:00-05 | 1
2020-08-22 00:00:00-05 | 1
2020-08-23 00:00:00-05 | 1
2020-08-24 00:00:00-05 | 1
2020-08-25 00:00:00-05 | 1
2020-08-26 00:00:00-05 | 1
2020-08-27 00:00:00-05 | 1
2020-08-28 00:00:00-05 | 2
2020-08-29 00:00:00-05 | 1
2020-08-30 00:00:00-05 | 1
2020-08-31 00:00:00-05 | 1
2020-09-01 00:00:00-05 | 1
2020-09-02 00:00:00-05 | 2
2020-09-03 00:00:00-05 | 1
2020-09-04 00:00:00-05 | 2
2020-09-05 00:00:00-05 | 1
2020-09-06 00:00:00-05 | 1
2020-09-07 00:00:00-05 | 1
2020-09-08 00:00:00-05 | 1
2020-09-09 00:00:00-05 | 1
2020-09-10 00:00:00-05 | 1
2020-09-11 00:00:00-05 | 1
2020-09-12 00:00:00-05 | 1
2020-09-13 00:00:00-05 | 1
2020-09-14 00:00:00-05 | 1
2020-09-15 00:00:00-05 | 1
El mar, 2 de feb. de 2021 a la(s) 03:31, Laurenz Albe (
laurenz(dot)albe(at)cybertec(dot)at) escribió:
> On Mon, 2021-02-01 at 19:14 -0600, Matt Zagrabelny wrote:
> > > > What is count(*) counting then? I thought it was rows.
> > >
> > > Yeah, but count(id) only counts rows where id isn't null.
> >
> > I guess I'm still not understanding it...
> >
> > I don't have any rows where id is null:
>
> Then the *result* of count(*) and count(id) will be the same.
>
> The asterisk in count(*) is misleading. Different from any other
> programming language that I know, the SQL standard has decided that
> you cannot have an aggregate function without arguments. You have
> to use the asterisk in that case.
>
> So count(*) really is count(), that is, it counts one for every
> row that it finds, no matter what the row contains.
>
> But count(id) includes a check: if "id IS NULL", it is not counted.
>
> If that condition is satisfied for all "id"s, you end up with
> the same count. But count(id) is more expensive, because it
> will perform this unnecessary NULLness check for each row.
>
> In short: use count(*) if you want to count rows, and use
> count(x) if you want to count all rows where x IS NOT NULL.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>
>
>
--
Cordialmente,
Ing. Hellmuth I. Vargas S.
From | Date | Subject | |
---|---|---|---|
Next Message | Joao Miguel Ferreira | 2021-02-02 21:58:15 | permission denied to create and drop user |
Previous Message | Tim Cross | 2021-02-02 19:14:01 | Re: ransomware |