Re: Help with a JOIN.

From: Paul Linehan <linehanp(at)tcd(dot)ie>
To: Ken Benson <Ken(at)infowerks(dot)com>
Cc: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Help with a JOIN.
Date: 2014-07-23 09:47:20
Message-ID: CAF4RT5Rb1Frv5nBbmGRw5SC1V9ciquKFyqv8Gu2_iU1V+ze8SQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hi all (again),

Ah... the joys of the shifting spec... I had an issue over the weekend and
received assistance from
the list. I thought the problem was solved, but in fact there was a
"modification to requirements" which
means that I'm back at square one!

I have two tables - ticket and comment (DDL and DML below) and I want to
get the latest comment on
a ticket which hasn't been modified for some arbitrary period (in this
example, 2 years (60M seconds)).

The result I want is this

ticket_id | ticket_description | c_max_date | comments_comment
-----------+--------------------+------------+-----------------------
3 | ticket 3 | 1089386130 | comment 4 on ticket 3
4 | ticket 4 | 1151680530 | comment 4 on ticket 4

but on running the query

SELECT t.ticket_id, t.ticket_description, x.c_max_date, z.comments_comment
FROM
(
SELECT ticket_id, MAX(c.comments_timestamp) AS c_max_date
FROM comment c
WHERE c.comments_timestamp < extract(epoch FROM TIMESTAMP '2014-07-20')
- 60000000
GROUP BY ticket_id
)
AS x INNER JOIN ticket AS t ON t.ticket_id = x.ticket_id
INNER JOIN comment AS z ON z.comments_timestamp = x.c_max_date;

I'm getting this

ticket_id | ticket_description | c_max_date | comments_comment
-----------+--------------------+------------+-----------------------
3 | ticket 3 | 1089386130 | comment 3 on ticket 4 <<===
matches a timestamp for ticket 4
3 | ticket 3 | 1089386130 | comment 4 on ticket 3
4 | ticket 4 | 1151680530 | comment 4 on ticket 4

This is because when I run this query

select ticket_id, comment_id, comments_timestamp
from comment where ticket_id IN (3, 4)
order by ticket_id, comment_id, comments_timestamp;

I see the data has the following

ticket_id | comment_id | comments_timestamp
-----------+------------+--------------------
3 | 9 | 1007133330
3 | 10 | 1013613330
3 | 11 | 1088608530
3 | 12 | 1089386130 <<-- matches on ticket 4
4 | 13 | 1070205330
4 | 14 | 1076685330
4 | 15 | 1089386130 <<-- matches on ticket 3
4 | 16 | 1151680530

So my query is picking up the match between 3 (latest comments_timestamp)
and a
timestamp on ticket 4 which is not the latest comments_timestamp for ticket
4.

What I would like is for my SQL to pick up the latest comments_timestamp by
each ticket
and not the match between the latest comment on ticket 3 and one of the
timestamps on
ticket 4.

I did think that I'd be able to use the comment_id field in the comment
table, but as
it turns out, this is *_not_* indicative of the latest comment - comments
may be modified
after they have been entered, so the largest value of comment_id for a
given ticket
is not necessarily the last modification for that ticket.

Any assistance/ideas gratefully received.

Paul...

======== DDL and DML for my problem ====================

CREATE TABLE IF NOT EXISTS ticket
(
ticket_id INT NOT NULL,
ticket_description VARCHAR(25),
PRIMARY KEY (ticket_id)
);

CREATE TABLE IF NOT EXISTS comment
(
comment_id INT NOT NULL,
ticket_id INT NOT NULL,
comments_comment VARCHAR(45) NULL,
comments_timestamp INT NULL,
PRIMARY KEY (comment_id)
);

INSERT INTO ticket VALUES(1, 'ticket 1');
INSERT INTO ticket VALUES(2, 'ticket 2');
INSERT INTO ticket VALUES(3, 'ticket 3');
INSERT INTO ticket VALUES(4, 'ticket 4');
INSERT INTO ticket VALUES(5, 'ticket 5');

INSERT INTO comment VALUES(1, 1, 'comment 1 on ticket 1', extract(epoch
FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(2, 1, 'comment 2 on ticket 1', extract(epoch
FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(3, 1, 'comment 3 on ticket 1', extract(epoch
FROM TIMESTAMP '2014-06-30 15:15:30'));
INSERT INTO comment VALUES(4, 1, 'comment 4 on ticket 1', extract(epoch
FROM TIMESTAMP '2014-07-09 15:15:30'));
INSERT INTO comment VALUES(5, 2, 'comment 1 on ticket 2', extract(epoch
FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(6, 2, 'comment 2 on ticket 2', extract(epoch
FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(7, 2, 'comment 3 on ticket 2', extract(epoch
FROM TIMESTAMP '2014-07-30 15:15:30'));
INSERT INTO comment VALUES(8, 2, 'comment 4 on ticket 2', extract(epoch
FROM TIMESTAMP '2014-08-09 15:15:30'));
INSERT INTO comment VALUES(9, 3, 'comment 1 on ticket 3', extract(epoch
FROM TIMESTAMP '2001-11-30 15:15:30'));
INSERT INTO comment VALUES(10, 3, 'comment 2 on ticket 3', extract(epoch
FROM TIMESTAMP '2002-02-13 15:15:30'));
INSERT INTO comment VALUES(11, 3, 'comment 3 on ticket 3', extract(epoch
FROM TIMESTAMP '2004-06-30 15:15:30'));
INSERT INTO comment VALUES(12, 3, 'comment 4 on ticket 3', extract(epoch
FROM TIMESTAMP '2004-07-09 15:15:30'));
INSERT INTO comment VALUES(13, 4, 'comment 1 on ticket 4', extract(epoch
FROM TIMESTAMP '2003-11-30 15:15:30'));
INSERT INTO comment VALUES(14, 4, 'comment 2 on ticket 4', extract(epoch
FROM TIMESTAMP '2004-02-13 15:15:30'));
INSERT INTO comment VALUES(15, 4, 'comment 3 on ticket 4', extract(epoch
FROM TIMESTAMP '2004-07-09 15:15:30'));
INSERT INTO comment VALUES(16, 4, 'comment 4 on ticket 4', extract(epoch
FROM TIMESTAMP '2006-06-30 15:15:30'));
INSERT INTO comment VALUES(17, 5, 'comment 1 on ticket 1', extract(epoch
FROM TIMESTAMP '2013-11-30 15:15:30'));
INSERT INTO comment VALUES(18, 5, 'comment 2 on ticket 2', extract(epoch
FROM TIMESTAMP '2014-02-13 15:15:30'));
INSERT INTO comment VALUES(19, 5, 'comment 3 on ticket 3', extract(epoch
FROM TIMESTAMP '2014-06-30 15:15:30'));
INSERT INTO comment VALUES(20, 5, 'comment 4 on ticket 4', extract(epoch
FROM TIMESTAMP '2014-07-09 15:15:30'));

2014-07-20 17:51 GMT+01:00 Paul Linehan <linehanp(at)tcd(dot)ie>:

>
> [[KenB]] You're asking for the MAX(comments_comment) …
>
> <rest snipped>
>
>
> Thanks for that Ken - just another small question for the list though.
>
> Your query works perfectly for the data that I posted, however, I have
> data which has
> duplicated dates (see DDL and DML below).
>
> With this data, the result of your query
>
> SELECT t.ticket_id, t.ticket_description, x.c_max_date, z.comments_comment
> FROM
> (
> SELECT ticket_id, MAX(c.comments_timestamp) AS c_max_date
> FROM comment c
> WHERE c.comments_timestamp < extract(epoch FROM TIMESTAMP '2014-07-20')
> - 60000000
> GROUP BY ticket_id
> )
> AS x INNER JOIN ticket AS t ON t.ticket_id = x.ticket_id
> INNER JOIN comment AS z ON z.comments_timestamp = x.c_max_date;
>
> is
>
> ticket_id | ticket_description | c_max_date | comments_comment
> -----------+--------------------+------------+-----------------------
> 3 | ticket 3 | 1089386130 | comment 3 on ticket 4
> <<=== matches a timestamp for ticket 4
> 3 | ticket 3 | 1089386130 | comment 4 on ticket 3
> 4 | ticket 4 | 1151680530 | comment 4 on ticket 4
>
>
> The result I want is
>
> ticket_id | ticket_description | c_max_date | comments_comment
> -----------+--------------------+------------+-----------------------
> 3 | ticket 3 | 1089386130 | comment 4 on ticket 3
> 4 | ticket 4 | 1151680530 | comment 4 on ticket 4
>
>
>
> Running this query on the comments table
> select ticket_id, comment_id, comments_timestamp
> from comment where ticket_id IN (3, 4)
> order by ticket_id, comment_id, comments_timestamp;
>
> gives
>
> ticket_id | comment_id | comments_timestamp
> -----------+------------+--------------------
> 3 | 9 | 1007133330
> 3 | 10 | 1013613330
> 3 | 11 | 1088608530
> 3 | 12 | 1089386130 <<-- matches on ticket 4
> 4 | 13 | 1070205330
> 4 | 14 | 1076685330
> 4 | 15 | 1089386130 <<-- matches on ticket 3
> 4 | 16 | 1151680530
>
>
> The problem is that the last date for ticket 3 matches a date (not the
> last)
> for ticket 4.
>
> Is there any way that I can pick up only the latest ticket in this case -
> you may
> assume that comment_id is some sort of auto incrementing field - but I
> can't include
> comment_id in my x subquery. I've been experimenting, but to no avail.
>
> TIA and rgs,
>
> Paul...
>
>
> ======== DDL and DML for my problem ====================
>
> CREATE TABLE IF NOT EXISTS ticket
> (
> ticket_id INT NOT NULL,
> ticket_description VARCHAR(25),
> PRIMARY KEY (ticket_id)
> );
>
>
> CREATE TABLE IF NOT EXISTS comment
> (
> comment_id INT NOT NULL,
> ticket_id INT NOT NULL,
> comments_comment VARCHAR(45) NULL,
> comments_timestamp INT NULL,
> PRIMARY KEY (comment_id)
> );
>
> INSERT INTO ticket VALUES(1, 'ticket 1');
> INSERT INTO ticket VALUES(2, 'ticket 2');
> INSERT INTO ticket VALUES(3, 'ticket 3');
> INSERT INTO ticket VALUES(4, 'ticket 4');
> INSERT INTO ticket VALUES(5, 'ticket 5');
>
> INSERT INTO comment VALUES(1, 1, 'comment 1 on ticket 1', extract(epoch
> FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(2, 1, 'comment 2 on ticket 1', extract(epoch
> FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(3, 1, 'comment 3 on ticket 1', extract(epoch
> FROM TIMESTAMP '2014-06-30 15:15:30'));
> INSERT INTO comment VALUES(4, 1, 'comment 4 on ticket 1', extract(epoch
> FROM TIMESTAMP '2014-07-09 15:15:30'));
> INSERT INTO comment VALUES(5, 2, 'comment 1 on ticket 2', extract(epoch
> FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(6, 2, 'comment 2 on ticket 2', extract(epoch
> FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(7, 2, 'comment 3 on ticket 2', extract(epoch
> FROM TIMESTAMP '2014-07-30 15:15:30'));
> INSERT INTO comment VALUES(8, 2, 'comment 4 on ticket 2', extract(epoch
> FROM TIMESTAMP '2014-08-09 15:15:30'));
> INSERT INTO comment VALUES(9, 3, 'comment 1 on ticket 3', extract(epoch
> FROM TIMESTAMP '2001-11-30 15:15:30'));
> INSERT INTO comment VALUES(10, 3, 'comment 2 on ticket 3', extract(epoch
> FROM TIMESTAMP '2002-02-13 15:15:30'));
> INSERT INTO comment VALUES(11, 3, 'comment 3 on ticket 3', extract(epoch
> FROM TIMESTAMP '2004-06-30 15:15:30'));
> INSERT INTO comment VALUES(12, 3, 'comment 4 on ticket 3', extract(epoch
> FROM TIMESTAMP '2004-07-09 15:15:30'));
> INSERT INTO comment VALUES(13, 4, 'comment 1 on ticket 4', extract(epoch
> FROM TIMESTAMP '2003-11-30 15:15:30'));
> INSERT INTO comment VALUES(14, 4, 'comment 2 on ticket 4', extract(epoch
> FROM TIMESTAMP '2004-02-13 15:15:30'));
> INSERT INTO comment VALUES(15, 4, 'comment 3 on ticket 4', extract(epoch
> FROM TIMESTAMP '2004-07-09 15:15:30'));
> INSERT INTO comment VALUES(16, 4, 'comment 4 on ticket 4', extract(epoch
> FROM TIMESTAMP '2006-06-30 15:15:30'));
> INSERT INTO comment VALUES(17, 5, 'comment 1 on ticket 1', extract(epoch
> FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(18, 5, 'comment 2 on ticket 2', extract(epoch
> FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(19, 5, 'comment 3 on ticket 3', extract(epoch
> FROM TIMESTAMP '2014-06-30 15:15:30'));
> INSERT INTO comment VALUES(20, 5, 'comment 4 on ticket 4', extract(epoch
> FROM TIMESTAMP '2014-07-09 15:15:30'));
>
>
> --
>
> linehanp(at)tcd(dot)ie
>
> Mob: 00 353 86 864 5772
>
>
>

--

linehanp(at)tcd(dot)ie

Mob: 00 353 86 864 5772

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Ken Benson 2014-07-23 14:11:03 Re: Help with a JOIN.
Previous Message Albe Laurenz 2014-07-22 10:18:19 Re: Creating a hot copy of PostgreSQL database