Re: A 3 table join question

From: Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com>
To: stan <stanb(at)panix(dot)com>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: A 3 table join question
Date: 2019-08-17 01:18:03
Message-ID: CAD3a31V3U9LWhGUFU+DZUGEHW4Z4c4n1y84mcwTsUjL=KFMaag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Aug 16, 2019 at 5:54 PM stan <stanb(at)panix(dot)com> wrote:

>
> On Fri, Aug 16, 2019 at 12:30:33PM -0700, Ken Tanzer wrote:
> > On Fri, Aug 16, 2019 at 7:24 AM rob stone <floriparob(at)gmail(dot)com> wrote:
> >
> > > Hello,
> > >
> > > On Fri, 2019-08-16 at 07:39 -0400, stan wrote:
> > > > What am I doing wrong here?
> > > >
> > >
> > >
> > > Your view assumes that all three "streams" contain all the proj_no's
> > > whereas your test data for expense_report_cost_sum_view has no proj_no
> > > = 764.
> > >
> > >
> > Hi. I'm probably missing something, but it seems simpler to either join
> > with USING, or by COALESCEing the two ID fields in left part of the JOIN
> > clause (COALESCE(t1.proj_no,t2.proj_no)=t3.proj_no).
> >
> > Cheers,
> > Ken
> >
> > CREATE TEMP TABLE t1 (id int, t1_val TEXT);
> > INSERT INTO t1 VALUES (2,'T1_2');
> > INSERT INTO t1 VALUES (5,'T1_5');
> > INSERT INTO t1 VALUES (7,'T1_7');
> > INSERT INTO t1 VALUES (10,'T1_10');
> >
> > CREATE TEMP TABLE t2 (id int, t2_val TEXT);
> > INSERT INTO t2 VALUES (3,'T2_3');
> > INSERT INTO t2 VALUES (5,'T2_5');
> > INSERT INTO t2 VALUES (6,'T2_6');
> > INSERT INTO t2 VALUES (10,'T2_10');
> >
> > CREATE TEMP TABLE t3 (id int, t3_val TEXT);
> > INSERT INTO t3 VALUES (4,'T3_4');
> > INSERT INTO t3 VALUES (6,'T3_6');
> > INSERT INTO t3 VALUES (7,'T3_7');
> > INSERT INTO t3 VALUES (10,'T3_10');
> >
> > SELECT id,t1_val,t2_val,t3_val
> > FROM
> > t1
> > FULL JOIN t2 USING (id)
> > FULL JOIN t3 USING (id)
> > ;
> >
> > SELECT COALESCE(t1.id,t2.id,t3.id) AS id,t1_val,t2_val,t3_val
> > FROM
> > t1
> > FULL JOIN t2 ON (t1.id=t2.id)
> > FULL JOIN t3 ON (COALESCE(t1.id,t2.id)=t3.id)
> > ;
> >
> > id | t1_val | t2_val | t3_val
> > ----+--------+--------+--------
> > 2 | T1_2 | |
> > 3 | | T2_3 |
> > 4 | | | T3_4
> > 5 | T1_5 | T2_5 |
> > 6 | | T2_6 | T3_6
> > 7 | T1_7 | | T3_7
> > 10 | T1_10 | T2_10 | T3_10
> > (7 rows)
> >
> OK, I am clearly not understanding this yet.
>
> Here is what I am trying:
>
>
> select
> COALESCE(
> labor_cost_sum_view.proj_no ,
> material_cost_sum_view.proj_no ,
> expense_report_cost_sum_view.proj_no
> )
> AS
> proj_no ,
> labor_cost_sum_view.labor_cost_sum ,
> material_cost_sum_view.mtrl_cost ,
> expense_report_cost_sum_view.burdened_cost ,
> coalesce( labor_cost_sum_view.labor_cost_sum, 0)
> +
> coalesce(material_cost_sum_view.mtrl_cost, 0)
> +
> coalesce(expense_report_cost_sum_view.burdened_cost, 0)
> from
> labor_cost_sum_view
> full join material_cost_sum_view on
> material_cost_sum_view.proj_no = labor_cost_sum_view.proj_no
> full join expense_report_cost_sum_view on
> expense_report_cost_sum_view.proj_no = labor_cost_sum_view.proj_no
> ;
>
>
> Having simplified things a bunch, and removing casts and aliases etc.
>
> But here is what I am getting as a result set:
>
> proj_no | labor_cost_sum | mtrl_cost | burdened_cost | ?column?
> ---------+----------------+-----------+---------------+----------------
> 45 | 3133.17500000 | 5394.6800 | 2564.980000 | 11092.83500000
> 764 | 810.75000000 | 7249.4800 | | 8060.23000000
> 789 | 7015.57500000 | | | 7015.57500000
> 7456 | | 4007.3000 | | 4007.3000
> 33 | | | 241.380000 | 241.380000
> 7456 | | | 1747.110000 | 1747.110000
> (6 row
>
> Note that project number 7456 appears in 2 rows.
>
> --

If you look at 7456, it has data in your 2nd and 3rd tables (material &
expense/burdened), but not the 1st (labor). So the first two tables (labor
& material) get joined for the first row you see there. Then you are
joining the 3rd table (expense) to this join on:

expense_report_cost_sum_view.proj_no = labor_cost_sum_view.proj_no

But there is no labor_cost_sum_view.proj_no for 7456. So the join doesn't
match, and you're getting a separate row for the expense column.

In the example I gave, the 1st two IDs were coalesced:

FULL JOIN t3 ON *(COALESCE(t1.id <http://t1.id/>,t2.id
<http://t2.id/>)=t3.id <http://t3.id/>)*

Which in your case would translate to:

full join expense_report_cost_sum_view on
(expense_report_cost_sum_view.proj_no =
*COALESCE(labor_cost_sum_view.proj_no,material_cost_sum_view.proj_no))*

For 7456, the coalesce will yield the 7456 that would then join to your
expense row.

Also, FWIW, since your proj_no that you are matching on is the same in all
tables, you can join with USING instead. Aside from being a little simpler
to write out, you also end up with only one proj_no column instead of 3,
and avoids the need to coalesce them as you are doing in the select.

Cheers,
Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client>*
ken(dot)tanzer(at)agency-software(dot)org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request(at)lists(dot)sourceforge(dot)net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message stan 2019-08-17 15:45:02 A user atribute question
Previous Message stan 2019-08-17 00:54:31 Re: A 3 table join question