Re: Unexpected results when joining on date fields

From: Rick Genter <rick(dot)genter(at)gmail(dot)com>
To: rick(dot)genter(at)gmail(dot)com
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Unexpected results when joining on date fields
Date: 2011-07-12 15:23:01
Message-ID: CADie1rzLnHsYy9H=PTpNqOKvhXZYDoDD3vUABFYGN_==igHEow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I don't think you understand what JOIN does. Think of it as a double-nested
FOR loop: for each record that has the value on the left side of the JOIN,
it will match all records on the right side of the JOIN that meet the ON
criteria. For example, if I have two tables:

A (i int, j int):

i j
1 1
2 1
3 2
4 2
5 3
6 3

and

B (k int, j int)
k j
10 1
11 1
12 2
13 2
14 3
15 3

Then if I do

SELECT COUNT(*) FROM A JOIN B ON A.j = B.j

I'll get 12. Each record in A matches 2 records in B on the value of j.
Study the following transcript:

bash-3.2$ bin/psql -d g2_master
Password:
psql (8.4.4)
Type "help" for help.

g2_master=# CREATE TABLE A (i int, j int);
CREATE TABLE
g2_master=# CREATE TABLE B (k int, j int);
CREATE TABLE
g2_master=# INSERT INTO A VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6,
3);
INSERT 0 6
g2_master=# INSERT INTO B VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14,
3), (15, 3);
INSERT 0 6
g2_master=# SELECT COUNT(*) FROM A JOIN B ON A.j = B.j;
count
-------
12
(1 row)

g2_master=# SELECT * FROM A JOIN B ON A.j = B.j;
i | j | k | j
---+---+----+---
1 | 1 | 10 | 1
1 | 1 | 11 | 1
2 | 1 | 10 | 1
2 | 1 | 11 | 1
3 | 2 | 12 | 2
3 | 2 | 13 | 2
4 | 2 | 12 | 2
4 | 2 | 13 | 2
5 | 3 | 14 | 3
5 | 3 | 15 | 3
6 | 3 | 14 | 3
6 | 3 | 15 | 3
(12 rows)

g2_master=#

On Sun, Jul 10, 2011 at 4:58 PM, Tim Uckun <tim(at)basediary(dot)com> wrote:

> I have two tables, traffic and sales. Each one has a date field and
> lists the traffic and sales broken down by various parameters
> (multiple rows for each date).
>
> If I run select (select count(*) from traffic) as traffic, (select
> count(*) from sales) as sales; I get the following 49383;167807
>
> if I run select count(*) from traffic t inner join sales s on t.date
> = s.date I get 24836841.
>
> If I change the join to a left join, right join, full join I get the
> same number of records.
>
> So I created a data table which just has the dates in it and ran this
> query.
>
> select count(d.date) from dates d
> inner join traffic t on t.date = d.date
> inner join sales s on s.date = d.date
>
> And I get the same number 24836841
>
> Same goes for right joins on the above query. Left joins of course
> give a different answer as there are more dates in the date table than
> there are in the other tables.
>
> I am a bit perplexed by what is happening here.
>
> Cheers
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

--
Rick Genter
rick(dot)genter(at)gmail(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Terry Lee Tucker 2011-07-12 15:25:24 Marking a Column for Special Use
Previous Message Prabhat Kumar 2011-07-12 14:24:14 Re: Schema for Website Comments