Re: Help with a JOIN.

From: Ken Benson <Ken(at)infowerks(dot)com>
To: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Help with a JOIN.
Date: 2014-07-23 14:11:03
Message-ID: eaf2846d9cd74132ae1e917ae89a7589@BY2PR02MB028.namprd02.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

From: Paul Linehan [mailto:linehanp(at)tcd(dot)ie]
Sent: Wednesday, July 23, 2014 2:47 AM
To: Ken Benson
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: [NOVICE] Help with a JOIN.

Hi all (again),

Ah... the joys of the shifting spec... I had an issue over the weekend and received assistance from
the list. I thought the problem was solved, but in fact there was a "modification to requirements" which
means that I'm back at square one!

I have two tables - ticket and comment (DDL and DML below) and I want to get the latest comment on
a ticket which hasn't been modified for some arbitrary period (in this example, 2 years (60M seconds)).

The result I want is this

ticket_id | ticket_description | c_max_date | comments_comment
-----------+--------------------+------------+-----------------------
3 | ticket 3 | 1089386130 | comment 4 on ticket 3
4 | ticket 4 | 1151680530 | comment 4 on ticket 4

but on running the query

SELECT t.ticket_id, t.ticket_description, x.c_max_date, z.comments_comment
FROM
(
SELECT ticket_id, MAX(c.comments_timestamp) AS c_max_date
FROM comment c
WHERE c.comments_timestamp < extract(epoch FROM TIMESTAMP '2014-07-20') - 60000000
GROUP BY ticket_id
)
AS x INNER JOIN ticket AS t ON t.ticket_id = x.ticket_id
INNER JOIN comment AS z ON z.comments_timestamp = x.c_max_date
[[KenB]] AND x.ticket_id=z.ticket_id
;

I'm getting this

ticket_id | ticket_description | c_max_date | comments_comment
-----------+--------------------+------------+-----------------------
3 | ticket 3 | 1089386130 | comment 3 on ticket 4 <<=== matches a timestamp for ticket 4
3 | ticket 3 | 1089386130 | comment 4 on ticket 3
4 | ticket 4 | 1151680530 | comment 4 on ticket 4

This is because when I run this query

select ticket_id, comment_id, comments_timestamp
from comment where ticket_id IN (3, 4)
order by ticket_id, comment_id, comments_timestamp;

I see the data has the following

ticket_id | comment_id | comments_timestamp
-----------+------------+--------------------
3 | 9 | 1007133330
3 | 10 | 1013613330
3 | 11 | 1088608530
3 | 12 | 1089386130 <<-- matches on ticket 4
4 | 13 | 1070205330
4 | 14 | 1076685330
4 | 15 | 1089386130 <<-- matches on ticket 3
4 | 16 | 1151680530

So my query is picking up the match between 3 (latest comments_timestamp) and a
timestamp on ticket 4 which is not the latest comments_timestamp for ticket 4.

What I would like is for my SQL to pick up the latest comments_timestamp by each ticket
and not the match between the latest comment on ticket 3 and one of the timestamps on
ticket 4.

I did think that I'd be able to use the comment_id field in the comment table, but as
it turns out, this is *_not_* indicative of the latest comment - comments may be modified
after they have been entered, so the largest value of comment_id for a given ticket
is not necessarily the last modification for that ticket.

Any assistance/ideas gratefully received.

Paul...

======== DDL and DML for my problem ====================

CREATE TABLE IF NOT EXISTS ticket
(
ticket_id INT NOT NULL,
ticket_description VARCHAR(25),
PRIMARY KEY (ticket_id)
);

CREATE TABLE IF NOT EXISTS comment
(
comment_id INT NOT NULL,
ticket_id INT NOT NULL,
comments_comment VARCHAR(45) NULL,
comments_timestamp INT NULL,
PRIMARY KEY (comment_id)
);

INSERT INTO ticket VALUES(1, 'ticket 1');
INSERT INTO ticket VALUES(2, 'ticket 2');
INSERT INTO ticket VALUES(3, 'ticket 3');
INSERT INTO ticket VALUES(4, 'ticket 4');
INSERT INTO ticket VALUES(5, 'ticket 5');

INSERT INTO comment VALUES(1, 1, 'comment 1 on ticket 1', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(2, 1, 'comment 2 on ticket 1', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(3, 1, 'comment 3 on ticket 1', extract(epoch FROM TIMESTAMP '2014-06-30 15:15:30'));
INSERT INTO comment VALUES(4, 1, 'comment 4 on ticket 1', extract(epoch FROM TIMESTAMP '2014-07-09 15:15:30'));
INSERT INTO comment VALUES(5, 2, 'comment 1 on ticket 2', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(6, 2, 'comment 2 on ticket 2', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(7, 2, 'comment 3 on ticket 2', extract(epoch FROM TIMESTAMP '2014-07-30 15:15:30'));
INSERT INTO comment VALUES(8, 2, 'comment 4 on ticket 2', extract(epoch FROM TIMESTAMP '2014-08-09 15:15:30'));
INSERT INTO comment VALUES(9, 3, 'comment 1 on ticket 3', extract(epoch FROM TIMESTAMP '2001-11-30 15:15:30'));
INSERT INTO comment VALUES(10, 3, 'comment 2 on ticket 3', extract(epoch FROM TIMESTAMP '2002-02-13 15:15:30'));
INSERT INTO comment VALUES(11, 3, 'comment 3 on ticket 3', extract(epoch FROM TIMESTAMP '2004-06-30 15:15:30'));
INSERT INTO comment VALUES(12, 3, 'comment 4 on ticket 3', extract(epoch FROM TIMESTAMP '2004-07-09 15:15:30'));
INSERT INTO comment VALUES(13, 4, 'comment 1 on ticket 4', extract(epoch FROM TIMESTAMP '2003-11-30 15:15:30'));
INSERT INTO comment VALUES(14, 4, 'comment 2 on ticket 4', extract(epoch FROM TIMESTAMP '2004-02-13 15:15:30'));
INSERT INTO comment VALUES(15, 4, 'comment 3 on ticket 4', extract(epoch FROM TIMESTAMP '2004-07-09 15:15:30'));
INSERT INTO comment VALUES(16, 4, 'comment 4 on ticket 4', extract(epoch FROM TIMESTAMP '2006-06-30 15:15:30'));
INSERT INTO comment VALUES(17, 5, 'comment 1 on ticket 1', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(18, 5, 'comment 2 on ticket 2', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(19, 5, 'comment 3 on ticket 3', extract(epoch FROM TIMESTAMP '2014-06-30 15:15:30'));
INSERT INTO comment VALUES(20, 5, 'comment 4 on ticket 4', extract(epoch FROM TIMESTAMP '2014-07-09 15:15:30'));

2014-07-20 17:51 GMT+01:00 Paul Linehan <linehanp(at)tcd(dot)ie<mailto:linehanp(at)tcd(dot)ie>>:

[[KenB]] You're asking for the MAX(comments_comment) …
<rest snipped>

Thanks for that Ken - just another small question for the list though.
Your query works perfectly for the data that I posted, however, I have data which has
duplicated dates (see DDL and DML below).

With this data, the result of your query

SELECT t.ticket_id, t.ticket_description, x.c_max_date, z.comments_comment
FROM
(
SELECT ticket_id, MAX(c.comments_timestamp) AS c_max_date
FROM comment c
WHERE c.comments_timestamp < extract(epoch FROM TIMESTAMP '2014-07-20') - 60000000
GROUP BY ticket_id
)
AS x INNER JOIN ticket AS t ON t.ticket_id = x.ticket_id
INNER JOIN comment AS z ON z.comments_timestamp = x.c_max_date;
is

ticket_id | ticket_description | c_max_date | comments_comment
-----------+--------------------+------------+-----------------------
3 | ticket 3 | 1089386130 | comment 3 on ticket 4 <<=== matches a timestamp for ticket 4
3 | ticket 3 | 1089386130 | comment 4 on ticket 3
4 | ticket 4 | 1151680530 | comment 4 on ticket 4

The result I want is

ticket_id | ticket_description | c_max_date | comments_comment
-----------+--------------------+------------+-----------------------
3 | ticket 3 | 1089386130 | comment 4 on ticket 3
4 | ticket 4 | 1151680530 | comment 4 on ticket 4

Running this query on the comments table
select ticket_id, comment_id, comments_timestamp
from comment where ticket_id IN (3, 4)
order by ticket_id, comment_id, comments_timestamp;

gives

ticket_id | comment_id | comments_timestamp
-----------+------------+--------------------
3 | 9 | 1007133330
3 | 10 | 1013613330
3 | 11 | 1088608530
3 | 12 | 1089386130 <<-- matches on ticket 4
4 | 13 | 1070205330
4 | 14 | 1076685330
4 | 15 | 1089386130 <<-- matches on ticket 3
4 | 16 | 1151680530

The problem is that the last date for ticket 3 matches a date (not the last)
for ticket 4.
Is there any way that I can pick up only the latest ticket in this case - you may
assume that comment_id is some sort of auto incrementing field - but I can't include
comment_id in my x subquery. I've been experimenting, but to no avail.

TIA and rgs,
Paul...

======== DDL and DML for my problem ====================

CREATE TABLE IF NOT EXISTS ticket
(
ticket_id INT NOT NULL,
ticket_description VARCHAR(25),
PRIMARY KEY (ticket_id)
);

CREATE TABLE IF NOT EXISTS comment
(
comment_id INT NOT NULL,
ticket_id INT NOT NULL,
comments_comment VARCHAR(45) NULL,
comments_timestamp INT NULL,
PRIMARY KEY (comment_id)
);

INSERT INTO ticket VALUES(1, 'ticket 1');
INSERT INTO ticket VALUES(2, 'ticket 2');
INSERT INTO ticket VALUES(3, 'ticket 3');
INSERT INTO ticket VALUES(4, 'ticket 4');
INSERT INTO ticket VALUES(5, 'ticket 5');

INSERT INTO comment VALUES(1, 1, 'comment 1 on ticket 1', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(2, 1, 'comment 2 on ticket 1', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(3, 1, 'comment 3 on ticket 1', extract(epoch FROM TIMESTAMP '2014-06-30 15:15:30'));
INSERT INTO comment VALUES(4, 1, 'comment 4 on ticket 1', extract(epoch FROM TIMESTAMP '2014-07-09 15:15:30'));
INSERT INTO comment VALUES(5, 2, 'comment 1 on ticket 2', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(6, 2, 'comment 2 on ticket 2', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(7, 2, 'comment 3 on ticket 2', extract(epoch FROM TIMESTAMP '2014-07-30 15:15:30'));
INSERT INTO comment VALUES(8, 2, 'comment 4 on ticket 2', extract(epoch FROM TIMESTAMP '2014-08-09 15:15:30'));
INSERT INTO comment VALUES(9, 3, 'comment 1 on ticket 3', extract(epoch FROM TIMESTAMP '2001-11-30 15:15:30'));
INSERT INTO comment VALUES(10, 3, 'comment 2 on ticket 3', extract(epoch FROM TIMESTAMP '2002-02-13 15:15:30'));
INSERT INTO comment VALUES(11, 3, 'comment 3 on ticket 3', extract(epoch FROM TIMESTAMP '2004-06-30 15:15:30'));
INSERT INTO comment VALUES(12, 3, 'comment 4 on ticket 3', extract(epoch FROM TIMESTAMP '2004-07-09 15:15:30'));
INSERT INTO comment VALUES(13, 4, 'comment 1 on ticket 4', extract(epoch FROM TIMESTAMP '2003-11-30 15:15:30'));
INSERT INTO comment VALUES(14, 4, 'comment 2 on ticket 4', extract(epoch FROM TIMESTAMP '2004-02-13 15:15:30'));
INSERT INTO comment VALUES(15, 4, 'comment 3 on ticket 4', extract(epoch FROM TIMESTAMP '2004-07-09 15:15:30'));
INSERT INTO comment VALUES(16, 4, 'comment 4 on ticket 4', extract(epoch FROM TIMESTAMP '2006-06-30 15:15:30'));
INSERT INTO comment VALUES(17, 5, 'comment 1 on ticket 1', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(18, 5, 'comment 2 on ticket 2', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(19, 5, 'comment 3 on ticket 3', extract(epoch FROM TIMESTAMP '2014-06-30 15:15:30'));
INSERT INTO comment VALUES(20, 5, 'comment 4 on ticket 4', extract(epoch FROM TIMESTAMP '2014-07-09 15:15:30'));

--

linehanp(at)tcd(dot)ie<mailto:linehanp(at)tcd(dot)ie>

Mob: 00 353 86 864 5772

--

linehanp(at)tcd(dot)ie<mailto:linehanp(at)tcd(dot)ie>

Mob: 00 353 86 864 5772

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message David G Johnston 2014-07-23 14:24:25 Re: Help with a JOIN.
Previous Message Paul Linehan 2014-07-23 09:47:20 Re: Help with a JOIN.