Re: Inappropriate inner table for nested loop join

From: Akihiko Odaki <akihiko(dot)odaki(dot)4i(at)stu(dot)hosei(dot)ac(dot)jp>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Inappropriate inner table for nested loop join
Date: 2017-06-23 11:54:39
Message-ID: 81a17ec0-4556-8d41-e043-777433d8ba10@stu.hosei.ac.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 2017-06-23 20:20, Albe Laurenz wrote:
> You could either try to do something like
>
> SELECT *
> FROM (SELECT "posts".*
> FROM "posts"
> JOIN "follows" ON "follows"."target_account" =
"posts"."account"
> WHERE "follows"."owner_account" = $1
> OFFSET 0) q
> ORDER BY "posts"."timestamp"
> LIMIT 100;

Now I wonder whether it actually sorted or not. As you said, I want to
"find rows with the greatest 'timestamp', match with rows from 'posts'
in a nested loop and stop as soon as it has found 100 matches".

However, it seems to query 100 records without any consideration for
"timestamp", and then sorts them. That is not expected. Here is a
abstract query plan:

Limit
-> Sort
Sort Key: posts.id DESC
-> Nested Loop
-> Seq Scan on follows
Filter: (owner_account = $1)
-> Index Scan using index_posts_on_account on posts
Index Cond: (account_id = follows.target_account)

index_posts_on_account is an obsolete index on "posts" and only for
"account". So it does nothing for sorting "timestamp".

Regards,
Akihiko Odaki

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Adam Brusselback 2017-06-23 12:09:49 Re: Dataset is fetched from cache but still takes same time to fetch records as first run
Previous Message Akihiko Odaki 2017-06-23 11:36:41 Re: Inappropriate inner table for nested loop join