From: | Andreas Joseph Krogh <andreas(at)visena(dot)com> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Re: Problem with duplicate rows when FULL OUTER JOIN'ing 3 derived tables |
Date: | 2014-06-12 19:32:39 |
Message-ID: | VisenaEmail.20c.1d6f9efc3aa09035.14691911e5e@tc7-on |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
På torsdag 12. juni 2014 kl. 16:02:39, skrev David G Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com <mailto:david(dot)g(dot)johnston(at)gmail(dot)com>>: Note - numbers
do not relate to your questions. This pretty much answers 1
and 3.
1. I'd ensure that project name/id on a table can never be null by creating
a dummy project that means "none assigned".
2. I would avoid full join. To do so I'd cross join a distinct list of
companies with a distinct list of projects.
3. Against the join in 2 you then left join three times, once for each
product table.
4. In the final result any product not having a correspond company/project
would have its value coalesced to zero.
The end result is a table without any NULL and, in the case of your
explanatory data, 9 rows - two of which would be all zeros: (c3,p1) and
(c3,p,2)
If you cannot change the raw data I'd use CTE/WITH to normalize the data
according to 1 and the use these CTEs in the rest of the query. I would
also do 2 is a CTE then 3 would be normal subqueries or relation references
as necessary.
David J. Hi and thanks for input. I cannot change the data so I have to
deal with NULLs. Not all companies or projects are in the result, only the
ones involved having fruits, by combining the derived tables. I therefore don't
see how I effectively can build up a list of distinct companies and projects to
join with. I have a simplified version (full schema below) here which shows
that there's an error in the query (when one extra derived table is added to
the query): SELECT
coalesce(apl.company_name, bns.company_name, pns.company_name,
prs.company_name) AS company_name,
coalesce(apl.project_name, bns.project_name, pns.project_name,
prs.project_name) AS project_name,
apl.qty AS
num_apples,
bns.qty AS
num_bananas,
pns.qty AS
num_pineapples,
prs.qty AS
num_pears FROM company_apples AS apl FULL OUTER JOIN
company_bananas AS bns
ON apl.company_id = bns.company_id
AND (apl.project_id IS NULL AND bns.project_id IS NULL OR
apl.project_id = bns.project_id) FULL OUTER JOIN company_pineapples AS pns
ON apl.company_id = pns.company_id
AND (apl.project_id IS NULL AND pns.project_id IS NULL OR
apl.project_id = pns.project_id) FULL OUTER JOIN company_pears AS prs
ON apl.company_id = prs.company_id
AND (apl.project_id IS NULL AND prs.project_id IS NULL OR
apl.project_id = prs.project_id) ORDER BY company_name ASC, project_name ASC
NULLS FIRST; This gives the result: # company_name project_name num_apples
num_bananas num_pineapples num_pears 1 C1 NULL 2 2 10 NULL 2 C1 P1 5 12 NULL
NULL 3 C1 P2 2 NULL NULL NULL 4 C2 NULL 3 NULL 10 NULL 5 C2 P1 3 NULL NULL NULL
6 C2 P2 3 NULL NULL NULL 7 C3 NULL NULL 8 NULL NULL 8 C3 NULL NULL NULL NULL 7
As you see, there are two rows for C3, which should have been 1, with
num_bananas=8 and num_pears=7. There has to be something wrong with my FULL
OUTER JOINs but I don't know what.... Anyone knows how to write the correct
query for this? Thanks. Full schame and example-data: drop table if
exists company_apples;
drop table if exists company_bananas;
drop table if exists company_pineapples; CREATE TABLE company_apples
(
company_id INT,
company_name VARCHAR,
project_id INT,
project_name VARCHAR,
qty BIGINT
); CREATE TABLE company_bananas
(
company_id INT,
company_name VARCHAR,
project_id INT,
project_name VARCHAR,
qty BIGINT
); CREATE TABLE company_pineapples
(
company_id INT,
company_name VARCHAR,
project_id INT,
project_name VARCHAR,
qty BIGINT
); CREATE TABLE company_pears
(
company_id INT,
company_name VARCHAR,
project_id INT,
project_name VARCHAR,
qty BIGINT
);
INSERT INTO company_apples (company_id, company_name, project_id,
project_name, qty) VALUES (1, 'C1', null, null, 2);
INSERT INTO company_apples (company_id, company_name, project_id,
project_name, qty) VALUES (1, 'C1', 1, 'P1', 5);
INSERT INTO company_apples (company_id, company_name, project_id,
project_name, qty) VALUES (1, 'C1', 2, 'P2', 2);
INSERT INTO company_apples (company_id, company_name, project_id,
project_name, qty) VALUES (2, 'C2', null, null, 3);
INSERT INTO company_apples (company_id, company_name, project_id,
project_name, qty) VALUES (2, 'C2', 1, 'P1', 3);
INSERT INTO company_apples (company_id, company_name, project_id,
project_name, qty) VALUES (2, 'C2', 2, 'P2', 3); INSERT INTO company_bananas
(company_id, company_name, project_id, project_name, qty) VALUES (1, 'C1',
null, null, 2);
INSERT INTO company_bananas (company_id, company_name, project_id,
project_name, qty) VALUES (1, 'C1', 1, 'P1', 12);
INSERT INTO company_bananas (company_id, company_name, project_id,
project_name, qty) VALUES (3, 'C3', null, null, 8); INSERT INTO
company_pineapples (company_id, company_name, project_id, project_name, qty)
VALUES (1, 'C1', null, null, 10);
INSERT INTO company_pineapples (company_id, company_name, project_id,
project_name, qty) VALUES (2, 'C2', null, null, 10); INSERT INTO company_pears
(company_id, company_name, project_id, project_name, qty) VALUES (3, 'C3',
null, null, 7); select company_name, project_name, qty from company_apples
order by company_name ASC, project_name ASC nulls first;
select company_name, project_name, qty from company_bananas order by
company_name ASC, project_name ASC nulls first;
select company_name, project_name, qty from company_pineapples order by
company_name ASC, project_name ASC nulls first;
select company_name, project_name, qty from company_pears order by
company_name ASC, project_name ASC nulls first; -- Andreas Jospeh Krogh CTO /
Partner - Visena AS Mobile: +47 909 56 963 andreas(at)visena(dot)com
<mailto:andreas(at)visena(dot)com> www.visena.com <https://www.visena.com>
<https://www.visena.com>
From | Date | Subject | |
---|---|---|---|
Next Message | David G Johnston | 2014-06-12 19:49:59 | Re: Problem with duplicate rows when FULL OUTER JOIN'ing 3 derived tables |
Previous Message | David G Johnston | 2014-06-12 14:02:39 | Re: Problem with duplicate rows when FULL OUTER JOIN'ing 3 derived tables |