Re: How to get non-existant values as NULL or 0?

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Pól Ua Laoínecháin <linehanp(at)tcd(dot)ie>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: How to get non-existant values as NULL or 0?
Date: 2019-06-29 22:33:45
Message-ID: 1775a838-4e6e-a0b7-9e5e-9849903f6ec4@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 6/29/19 9:20 AM, Pól Ua Laoínecháin wrote:
> I'm in a bit of a quandary. I'm trying to figure out how to have
> non-existent values appear as NULL (or using COALESCE, as 0).
>
> I have several tables (DDL and DML shown at the bottom of this post -
> and there is a fiddle available here:
> https://dbfiddle.uk/?rdbms=postgres_11&fiddle=171d207d25981e0db15791e1684de802

As a start added to above fiddle:

WITH dist AS
(SELECT distinct on(job_id, loc_key) job_id, loc_key FROM
the_location, job )
SELECT
loc_key, job_id, COALESCE(lj_percent, 0) AS percent
FROM
dist
LEFT JOIN
location_job AS lj
ON
dist.job_id = lj.lj_job
AND
dist.loc_key = lj.lj_loc
ORDER BY
loc_key, job_id;

loc_key | job_id | percent
---------+--------+---------
1 | 1 | 0
1 | 2 | 10
1 | 3 | 10
1 | 4 | 10
1 | 5 | 10
1 | 6 | 50
2 | 1 | 30
2 | 2 | 30
2 | 3 | 10
2 | 4 | 10
2 | 5 | 10
2 | 6 | 10

>
> The crux of the problem is that there is a joining table location_job
>
> CREATE TABLE location_job
> (
> lj_loc INTEGER NOT NULL,
> lj_job INTEGER NOT NULL,
> lj_percent INTEGER DEFAULT 0 NOT NULL,
> &c... - see fiddle or below
> );
>
> with entries as follows:
>
> INSERT INTO location_job
> VALUES
> (1, 2, 10), (1, 3, 10), (1, 4, 10), (1, 5, 10), (1, 6, 50), -- missing
> value (1, 1, 10)
> (2, 1, 30), (2, 2, 30), (2, 3, 10), (2, 4, 10), (2, 5, 10), (2, 6, 10);
>
> Now the lj_job from the missing value corresponds to a job_type of Unknown.
>
> So, performing this query:
>
> SELECT
> tl.loc_key, tl.loc_coords,
> lj.lj_loc, lj.lj_job, lj.lj_percent,
> j.job_id, j.job_type
> FROM the_location tl
> JOIN location_job lj ON tl.loc_key = lj.lj_loc
> JOIN job j ON lj.lj_job = j.job_id
> ORDER BY tl.loc_key, j.job_id;
>
> Gives me the results:
>
> loc_keyloc_coordslj_loclj_joblj_percentjob_idjob_type
> 1coord_112102Unemployed
> 1coord_113103Blue collar
> 1coord_114104White collar
>
> Notice that (as expected) there is no result for job_type Unknown for
> location 1 - it's the missing record - all fine and dandy!
>
> However, what I want to do is to include a record in my resultset that
> has NULL (or 0) for job_type Unknown.
>
> You will be able to see from the fiddle that I have tried varying the
> order of the joins and the join types (LEFT OUTER, RIGHT OUTER, FULL
> OUTER), all to no avail.
>
> My question boils down to how do I get the first line of my resultset to be
>
> 1 coord_1 1 1 0 1 Unknown -- the 0 corresponds to the percent of
> Unknowns in the location coord_1.
>
> I would be grateful for an explanation of any answer as I'm trying to
> learn SQL and this really has me stumped.
>
> Should you require any further information, please don't hesitate to
> let me know,
>
> TIA,
>
> Pól...
>
>
> =========== DDL and DML ===============
>
> CREATE TABLE crime
> (
> cr_rating VARCHAR (25) PRIMARY KEY
> );
>
> INSERT INTO crime VALUES ('Low'), ('Medium'), ('High');
>
> CREATE TABLE weather
> (
> weather_type VARCHAR (50) PRIMARY KEY
> );
>
> INSERT INTO weather VALUES ('Scorching'), ('Sunny'), ('Mild'),
> ('Overcast'), ('Drizzle'), ('Rainy'), ('Stormy'), ('Hurricane');
>
> CREATE TABLE job (job_id SERIAL PRIMARY KEY, job_type VARCHAR (30) NOT NULL);
>
> INSERT INTO job (job_type) VALUES ('Unknown'), ('Unemployed'), ('Blue
> collar'), ('White collar'), ('Manager'), ('Self-employed');
>
> CREATE TABLE the_location
> (
>
> loc_key SERIAL PRIMARY KEY,
> loc_coords VARCHAR (50) NOT NULL, -- not sure how you store these in
> your system
> loc_status INTEGER CHECK (loc_status IN (0,1)),
> loc_rating INTEGER CHECK (loc_rating BETWEEN 1 AND 10),
> loc_crime VARCHAR (25) NOT NULL,
> -- could use a CHECK CONSTRAINT, but use a lookup table instead
> -- note also that the types should match exactly
> loc_weather VARCHAR (50) NOT NULL,
> -- location_rating_tourism, -- same lookup table idea here (Good,
> Medium, Poor, Death-zone)
> -- location_rating_income, -- list of social classes table
> -- location_rating_jobs
> CONSTRAINT ct_loc_crime FOREIGN KEY (loc_crime) REFERENCES crime (cr_rating),
> CONSTRAINT ct_loc_weather FOREIGN KEY (loc_weather) REFERENCES
> weather (weather_type)
> );
>
> INSERT INTO the_location
> (loc_coords, loc_status, loc_rating, loc_crime, loc_weather)
> VALUES
> ('coord_1', 0, 7, 'Medium', 'Rainy'),
> ('coord_2', 1, 7, 'High', 'Mild');
>
> CREATE TABLE location_job
> (
> lj_loc INTEGER NOT NULL,
> lj_job INTEGER NOT NULL,
> lj_percent INTEGER DEFAULT 0 NOT NULL,
> PRIMARY KEY (lj_loc, lj_job),
> CONSTRAINT ct_lj_loc FOREIGN KEY (lj_loc) REFERENCES the_location (loc_key),
> CONSTRAINT ct_lj_job FOREIGN KEY (lj_job) REFERENCES job(job_id)
> );
>
> INSERT INTO location_job
> VALUES
> (1, 2, 10), (1, 3, 10), (1, 4, 10), (1, 5, 10), (1, 6, 50), -- missing
> value (1, 1, 10)
> (2, 1, 30), (2, 2, 30), (2, 3, 10), (2, 4, 10), (2, 5, 10), (2, 6, 10);
>
> -- and my first query - see fiddle for others
>
> SELECT
> tl.loc_key, tl.loc_coords,
> lj.lj_loc, lj.lj_job, lj.lj_percent,
> j.job_id, j.job_type
> FROM the_location tl
> JOIN location_job lj ON tl.loc_key = lj.lj_loc
> JOIN job j ON lj.lj_job = j.job_id
> ORDER BY tl.loc_key, j.job_id;
>
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2019-06-29 22:40:32 Re: Azure Database for PostgreSQL - how to add pgbouncer or similar?
Previous Message Pól Ua Laoínecháin 2019-06-29 16:48:18 Re: How to get non-existant values as NULL or 0?