From: | <mkslaf(at)keemail(dot)me> |
---|---|
To: | Hellmuth Vargas <hivs77(at)gmail(dot)com> |
Cc: | Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Efficiently searching for the most recent rows where a column matches any result from a different query |
Date: | 2018-02-15 12:18:00 |
Message-ID: | L5OKvxB--3-0@keemail.me |
Lists: | pgsql-performance |
Hello Hellmuth,
Thank you for your response.
I've uploaded the query plan for the first query (user_id=2) here: https://gist.github.com/anonymous/6d251b277ef71f8977b03cab91fedccd
The query plan for the second query (user_id=1) can be found here: https://gist.github.com/anonymous/32ed485b40cce2651ddc52661f3e7f7b
Just like in the original queries, posts_user_id_id_index is not used.
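For reference, one rewrite that might allow posts_user_id_id_index to be used is a LATERAL join that fetches the newest posts per friend and then merges them. This is only a sketch based on the schema quoted below; it has not been run against this data set, and the aliases f, p and u are chosen here purely for illustration.

```sql
-- Sketch only: per-friend top-10 lookup via LATERAL (available since PostgreSQL 9.3).
-- Each inner subquery filters on a single user_id and orders by id, which is the
-- access pattern that posts_user_id_id_index (user_id, id) is built for.
SELECT p.id, u.name, p.content
FROM friends f
CROSS JOIN LATERAL (
    SELECT posts.id, posts.user_id, posts.content
    FROM posts
    WHERE posts.user_id = f.friend_user_id
    ORDER BY posts.id DESC
    LIMIT 10
) p
JOIN users u ON u.id = p.user_id
WHERE f.user_id = 1
ORDER BY p.id DESC
LIMIT 10;
```

Since the overall 10 most recent posts can include at most 10 posts from any single friend, limiting each per-friend subquery to 10 rows should not change the result. Whether the planner actually picks the index here would still need to be checked with EXPLAIN (ANALYZE, BUFFERS).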
Kind regards,
Milo
13. Feb 2018 22:13 by hivs77(at)gmail(dot)com:
> Hello:
>
> EXPLAIN (ANALYZE, BUFFERS)
> select * from (
>     SELECT posts.id, users.name, posts.content
>     FROM posts JOIN users ON posts.user_id = users.id
>     WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 1)
>     ORDER BY posts.id DESC
> ) as a
> ORDER BY a.id DESC
> LIMIT 10;
> ------
>
> EXPLAIN (ANALYZE, BUFFERS)
> select * from (
>     SELECT posts.id, users.name, posts.content
>     FROM posts JOIN users ON posts.user_id = users.id
>     WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 2)
>     ORDER BY posts.id DESC
> ) as a
> ORDER BY a.id DESC
> LIMIT 10;
> 2018-02-13 8:28 GMT-05:00 <mkslaf(at)keemail(dot)me>:
>
>> Hello,
>> I have the following schema:
>> CREATE TABLE users (
>>     id BIGSERIAL PRIMARY KEY,
>>     name TEXT NOT NULL UNIQUE
>> );
>>
>> CREATE TABLE friends (
>>     user_id BIGINT NOT NULL REFERENCES users,
>>     friend_user_id BIGINT NOT NULL REFERENCES users,
>>     UNIQUE (user_id, friend_user_id)
>> );
>>
>> CREATE TABLE posts (
>>     id BIGSERIAL PRIMARY KEY,
>>     user_id BIGINT NOT NULL REFERENCES users,
>>     content TEXT NOT NULL
>> );
>> CREATE INDEX posts_user_id_id_index ON posts(user_id, id);
>> Each user can unilaterally follow any number of friends. The posts table has a large number of rows and is rapidly growing.
>> My goal is to retrieve the 10 most recent posts of a user's friends. This query gives the correct result, but is inefficient:
>> SELECT posts.id, users.name, posts.content
>> FROM posts JOIN users ON posts.user_id = users.id
>> WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 1)
>> ORDER BY posts.id DESC LIMIT 10;
>> If the user's friends have recently posted, the query is still reasonably fast (https://explain.depesz.com/s/6ykR). But if the user's friends haven't recently posted or the user has no friends, it quickly deteriorates (https://explain.depesz.com/s/OnoG).
>> If I match only a single post author (e.g. WHERE posts.user_id = 5), Postgres uses the index posts_user_id_id_index. But if I use IN, the index doesn't appear to be used at all.
>> How can I get these results more efficiently?
>> I've uploaded the schema and the queries I've tried to dbfiddle at http://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=cf1489b7f6d53c3fe0b55ed7ccbad1f0. The output of "SELECT version()" is "PostgreSQL 9.6.5 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit" for me.
>> Thank you in advance for any insights, pointers or suggestions you are able to give me.
>> Regards,
>> Milo
>
>
>
> --
> Cordially,
>
> Ing. Hellmuth I. Vargas S.
> Esp. Telemática y Negocios por Internet
> Oracle Database 10g Administrator Certified Associate
> EnterpriseDB Certified PostgreSQL 9.3 Associate
>