Re: Bug or incorrect usage?

From: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
To: Jordan Gigov <coladict(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Bug or incorrect usage?
Date: 2016-03-01 13:31:09
Message-ID: CAKOSWNm3mXd+VpPgsST=-rM4k24YU7A+AmJ8XXEHTeKA-nnsaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 3/1/16, Jordan Gigov <coladict(at)gmail(dot)com> wrote:
> So, I was trying to use left joins to get bot the total number of
> corresponding rows and the number in a specific subset, but it seems I
> can't do that in 9.4.6 (the changelogs after that don't suggest this has
> changed).
> This my small-scale test for it:
>
> CREATE TABLE somedata (id bigserial, something text, PRIMARY KEY (id));
> CREATE TABLE moredata (id bigserial, otherid bigint, status varchar,
> PRIMARY KEY (id), FOREIGN KEY (otherid) REFERENCES somedata(id));
> INSERT INTO somedata(something) VALUES ('Example 1'),('Example
> 2'),('Example 3');
> INSERT INTO moredata(otherid,status)
> VALUES(1,'NEW'),(3,'NEW'),(3,'OLD'),(3,'DEPRECATED');
>
> SELECT somedata.id, somedata.something, count(md1.id), count(md2.id)
> FROM somedata
> LEFT JOIN moredata md1 ON (md1.otherid = somedata.id)
> LEFT JOIN moredata md2 ON (md2.otherid = somedata.id AND md2.status NOT IN
> ('OLD', 'DEPRECATED'))
> GROUP BY somedata.id, somedata.something;
>
> What I expected the output to be is:
> id | something | count | count
> ----+-----------+-------+-------
> 2 | Example 2 | 0 | 0
> 3 | Example 3 | 3 | 1
> 1 | Example 1 | 1 | 1
> (3 rows)
>
> Instead I got:
> id | something | count | count
> ----+-----------+-------+-------
> 2 | Example 2 | 0 | 0
> 3 | Example 3 | 3 | 3
> 1 | Example 1 | 1 | 1
> (3 rows)

Ok, try to get result before grouping:
postgres=# SELECT
postgres-# somedata.id
postgres-# ,somedata.something
postgres-# ,md1.id
postgres-# ,md2.id
postgres-#
postgres-# -- ,count(md1.id)
postgres-# -- ,count(md2.id)
postgres-# FROM somedata
postgres-# LEFT JOIN moredata md1 ON (md1.otherid = somedata.id)
postgres-# LEFT JOIN moredata md2 ON (md2.otherid = somedata.id AND
md2.status NOT IN('OLD', 'DEPRECATED'))
postgres-# -- GROUP BY somedata.id, somedata.something;
postgres-# ;
id | something | id | id
----+-----------+----+----
1 | Example 1 | 1 | 1
3 | Example 3 | 4 | 2
3 | Example 3 | 3 | 2
3 | Example 3 | 2 | 2
2 | Example +| |
| 2 | |
(5 rows)

So it is very clear why count(md1.id) for somedata.id=3 gives result of 3...

> Running the searches with individual joins:
>
> SELECT somedata.id, somedata.something, count(md2.id)
> FROM somedata
> LEFT JOIN moredata md2 ON (md2.otherid = somedata.id AND md2.status NOT IN
> ('OLD', 'DEPRECATED'))
> GROUP BY somedata.id, somedata.something;
> id | something | count
> ----+-----------+-------
> 2 | Example 2 | 0
> 3 | Example 3 | 1
> 1 | Example 1 | 1

Here is a non-grouped result (without the first joining there are just
few rows for grouping):

postgres=# SELECT somedata.id, somedata.something, md2.id --count(md2.id)
postgres-# FROM somedata
postgres-# LEFT JOIN moredata md2 ON (md2.otherid = somedata.id AND
md2.status NOT IN ('OLD', 'DEPRECATED'))
postgres-# -- GROUP BY somedata.id, somedata.something;
postgres-# ;
id | something | id
----+-----------+----
1 | Example 1 | 1
3 | Example 3 | 2
2 | Example +|
| 2 |
(3 rows)

> SELECT somedata.id, somedata.something, count(md1.id)
> FROM somedata
> LEFT JOIN moredata md1 ON (md1.otherid = somedata.id)
> GROUP BY somedata.id, somedata.something;
> id | something | count
> ----+-----------+-------
> 2 | Example 2 | 0
> 3 | Example 3 | 3
> 1 | Example 1 | 1
>
>
> Am I misunderstanding something, or is some optimization messing-up my
> data?

So it works as expected.

--
Best regards,
Vitaly Burovoy

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2016-03-02 01:52:44 Re: BUG #13984: Multiple rows with the same primary key value exist.
Previous Message Jordan Gigov 2016-03-01 12:29:11 Bug or incorrect usage?