Re: Efficiently searching for the most recent rows where a column matches any result from a different query

From: <mkslaf(at)keemail(dot)me>
To: Hellmuth Vargas <hivs77(at)gmail(dot)com>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Efficiently searching for the most recent rows where a column matches any result from a different query
Date: 2018-02-15 12:18:00
Message-ID: L5OKvxB--3-0@keemail.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Hello Hellmuth,

Thank you for your response.
I've uploaded the query plan for the first query (user_id=2) here: https://gist.github.com/anonymous/6d251b277ef71f8977b03cab91fedccdThe query plan for the second query (user_id=1) can be found here: https://gist.github.com/anonymous/32ed485b40cce2651ddc52661f3e7f7b
Just like in the original queries, posts_user_id_id_index is not used.
Kind regards,Milo
13. Feb 2018 22:13 by hivs77(at)gmail(dot)com:

> Hello:
>
> EXPLAIN (ANALYZE, BUFFERS)> select * from (> > SELECT > posts.id> , > users.name> , posts.content> > FROM posts JOIN users ON posts.user_id = > users.id> > WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 1)
> > ORDER BY > posts.id> DESC > ) as a> ORDER BY > a.id> DESC > LIMIT 10;
> ------
>
> EXPLAIN (ANALYZE, BUFFERS)> select * from (> > SELECT > posts.id> , > users.name> , posts.content> > FROM posts JOIN users ON posts.user_id = > users.id> > WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 2)
> > ORDER BY > posts.id> DESC > ) as a> ORDER BY > a.id> DESC > LIMIT 10;
> 2018-02-13 8:28 GMT-05:00 <> mkslaf(at)keemail(dot)me> >:
>
>> >> Hello,
>> I have the following schema:
>>     CREATE TABLE users (>>         id   BIGSERIAL PRIMARY KEY,>>         name TEXT      NOT NULL UNIQUE>>     );>>     >>     CREATE TABLE friends (>>         user_id        BIGINT NOT NULL REFERENCES users,>>         friend_user_id BIGINT NOT NULL REFERENCES users,>>         UNIQUE (user_id, friend_user_id)>>     );>>     >>     CREATE TABLE posts (>>         id      BIGSERIAL PRIMARY KEY,>>         user_id BIGINT    NOT NULL REFERENCES users,>>         content TEXT      NOT NULL>>     );>>     CREATE INDEX posts_user_id_id_index ON posts(user_id, id);
>> Each user can unilaterally follow any number of friends. The posts table has a large number of rows and is rapidly growing.
>> My goal is to retrieve the 10 most recent posts of a user's friends. This query gives the correct result, but is inefficient:
>>     SELECT >> posts.id>> , >> users.name>> , posts.content>>     FROM posts JOIN users ON posts.user_id = >> users.id>>     WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 1)>>     ORDER BY >> posts.id>> DESC LIMIT 10;
>> If the user's friends have recently posted, the query is still reasonably fast (>> https://explain.depesz.com/s/6ykR>> ). But if the user's friends haven't recently posted or the user has no friends, it quickly deteriorates (>> https://explain.depesz.com/s/OnoG>> ).
>> If I match only a single post author (e.g. WHERE posts.user_id = 5), Postgres uses the index posts_user_id_id_index. But if I use IN, the index doesn't appear to be used at all.
>> How can I get these results more efficiently?
>> I've uploaded the schema and the queries I've tried to dbfiddle at >> http://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=cf1489b7f6d53c3fe0b55ed7ccbad1f0>> . The output of "SELECT version()" is "PostgreSQL 9.6.5 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit" for me.
>> Thank you in advance for any insights, pointers or suggestions you are able to give me.
>> Regards,>> Milo>>
>
>
>
> --
> Cordialmente,
>
> Ing. Hellmuth I. Vargas S.
> Esp. Telemática y Negocios por Internet > Oracle Database 10g Administrator Certified Associate
> EnterpriseDB Certified PostgreSQL 9.3 Associate
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mariel Cherkassky 2018-02-18 15:19:08 pgpool 2 rotate logs
Previous Message Mark Kirkwood 2018-02-14 03:43:50 Re: OT: Performance of VM