Re: newsfeed type query

From: Jonathan Vanasco <postgres(at)2xlp(dot)com>
To: Melvin Davidson <melvin6925(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: newsfeed type query
Date: 2015-04-28 23:57:47
Message-ID: DFC5C7D3-3044-4BB4-BD49-F6EDD75D6BA1@2xlp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Sorry, I was trying to ask something very abstract as I have similar situations on multiple groups of queries/tables (and they're all much more complex).

I'm on pg 9.3

The relevant structure is:

posting:
id
timestamp_publish
group_id__in
user_id__author

friends:
user_id__a
user_id__b

memberships:
user_id
group_id
role_id

-- working sql
CREATE TABLE groups(
id SERIAL NOT NULL PRIMARY KEY
);
CREATE TABLE users(
id SERIAL NOT NULL PRIMARY KEY
);
CREATE TABLE friends (
user_id__a INT NOT NULL REFERENCES users( id ),
user_id__b INT NOT NULL REFERENCES users( id )
);
CREATE TABLE memberships (
user_id INT NOT NULL REFERENCES users( id ),
group_id INT NOT NULL REFERENCES groups( id ),
role_id INT NOT NULL
);
CREATE TABLE posting (
id SERIAL NOT NULL,
timestamp_publish timestamp not null,
group_id__in INT NOT NULL REFERENCES groups(id),
user_id__author INT NOT NULL REFERENCES users(id),
is_published BOOL
);

The output that I'm trying to get is:
posting.id
{the context of the select}
posting.timestamp_publish (this may need to get correlated into other queries)

These approaches had bad performance:

-- huge selects / memory
-- it needs to load everything from 2 tables before it limits
EXPLAIN ANALYZE
SELECT id, feed_context FROM (
SELECT id, timestamp_publish, 'in-group' AS feed_context FROM posting
WHERE (
group_id__in IN (SELECT group_id FROM memberships WHERE user_id = 57 AND role_id IN (1,2,3))
AND (is_published = True AND timestamp_publish <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC')
)
UNION
SELECT id, timestamp_publish, 'by-user' AS feed_context FROM posting
WHERE (
user_id__author IN (SELECT user_id__b FROM friends WHERE user_id__a = 57)
AND (is_published = True AND timestamp_publish <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC')
)
) AS feed
ORDER BY timestamp_publish DESC
LIMIT 10
;

-- selects minimized, but repetitive subqueries
SELECT
id,
CASE
WHEN group_id__in IN (SELECT group_id FROM memberships WHERE user_id = 57 AND role_id IN (1,2,3)) THEN True
ELSE NULL
END AS feed_context_group,
CASE
WHEN user_id__author IN (SELECT user_id__b FROM friends WHERE user_id__a = 57) THEN True
ELSE NULL
END AS feed_context_user
FROM posting
WHERE (
group_id__in IN (SELECT group_id FROM memberships WHERE user_id = 57 AND role_id IN (1,2,3))
OR
user_id__author IN (SELECT user_id__b FROM friends WHERE user_id__a = 57)
)
AND (is_published = True AND timestamp_publish <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC')
ORDER BY timestamp_publish DESC
LIMIT 10
;

On Apr 28, 2015, at 6:56 PM, Melvin Davidson wrote:

> Since you very nicely DID NOT provide the pg version, O/S or table structure(s), which is what you should do REGARDLESS of the
> type of question (it's just the smart and polite thing to do when asking for help) The best I can suggest is:
> SELECT
> CASE WHEN context = 'friend' THEN p.junka
> WHEN context = 'group' THEN p.junkb
> WHEN context = 'both' THEN p.junka || ' ' || p.junkb
> END
> FROM posting p
> where p.author_id in (SELECT f.friend_id
> FROM friends f
> WHERE f.user_id = ?)
> OR p.group_id in (SELECT m.group_id
> FROM memberships m
> WHERE m.user_id = ?);

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jim Nasby 2015-04-29 01:14:32 Re: BDR Selective Replication
Previous Message Melvin Davidson 2015-04-28 22:56:26 Re: newsfeed type query