Re: Problem with duplicate rows when FULL OUTER JOIN'ing 3 derived tables

From: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Problem with duplicate rows when FULL OUTER JOIN'ing 3 derived tables
Date: 2014-06-12 19:52:22
Message-ID: CAKFQuwatcS14kwuqTL=TtCwUQiVdSpzP4pLW0tP+L2yFSUtQMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Thu, Jun 12, 2014 at 3:49 PM, David Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
wrote:

>
> SELECT
>> coalesce(apl.company_name, bns.company_name, pns.company_name,
>> prs.company_name) AS company_name,
>> coalesce(apl.project_name, bns.project_name, pns.project_name,
>> prs.project_name) AS project_name,
>> apl.qty AS
>> num_apples,
>> bns.qty AS
>> num_bananas,
>> pns.qty AS
>> num_pineapples,
>> prs.qty AS
>> num_pears
>> FROM
>>
>> company_apples AS apl
>>
>> FULL OUTER JOIN company_bananas AS bns
>> ON apl.company_id = bns.company_id
>> AND (apl.project_id IS NULL AND bns.project_id IS NULL OR
>> apl.project_id = bns.project_id)
>>
>> FULL OUTER JOIN company_pineapples AS pns
>> ON apl.company_id = pns.company_id
>> AND (apl.project_id IS NULL AND pns.project_id IS NULL OR
>> apl.project_id = pns.project_id)
>>
>> FULL OUTER JOIN company_pears AS prs
>> ON apl.company_id = prs.company_id
>> AND (apl.project_id IS NULL AND prs.project_id IS NULL OR
>> apl.project_id = prs.project_id)
>>
>> ORDER BY company_name ASC, project_name ASC NULLS FIRST;
>>
>>
>> This gives the result:
>>
>> *#* *company_name* *project_name* *num_apples* *num_bananas*
>> *num_pineapples* *num_pears* 1 C1 NULL 2 2 10 NULL 2 C1 P1 5 12 NULL
>> NULL 3 C1 P2 2 NULL NULL NULL 4 C2 NULL 3 NULL 10 NULL 5 C2 P1 3 NULL
>> NULL NULL 6 C2 P2 3 NULL NULL NULL 7 C3 NULL NULL 8 NULL NULL 8 C3
>> NULL NULL NULL NULL 7
>> As you see, there are two rows for C3, which should have been 1, with
>> num_bananas=8 and num_pears=7. There has to be something wrong with my FULL
>> OUTER JOINs but I don't know what....
>>
>>
>>
> ​Row #7 is from the right side of the outer join between apples and bananas
> Row #8 is from the right side of the outer join between apples and pears
>
> Since you are only using apple as the source of valid company-project
> pairs only those records will be "correct" per your definition.
>
> Even if you do not use a CROSS JOIN between company/product you have to
> obtain a master list of ​valid company-project pairs from ALL of the target
> tables. You can then LEFT JOIN that master against each of the target
> tables and be sure that you have a valid master record to attach to.
>
> The direct way to do this is:
>
> SELECT DISTINCT company_id, project_id FROM apples
> UNION DISTINCT
> SELECT DISTINCT company_id, project_id FROM pears
> [and so forth]
>
> David J.
>
>
​Though I guess you could also simply chain together the FULL OUTER join:

FROM ( ( (apple OUTER bananas)​ AS ab OUTER pears ) AS abp OUTER pineapples
) AS abpp

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-duplicate-rows-when-FULL-OUTER-JOIN-ing-3-derived-tables-tp5806970p5807077.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Andreas Joseph Krogh 2014-06-12 20:04:32 Re: Re: Problem with duplicate rows when FULL OUTER JOIN'ing 3 derived tables
Previous Message David G Johnston 2014-06-12 19:49:59 Re: Problem with duplicate rows when FULL OUTER JOIN'ing 3 derived tables