Queries with Joins before filtering taking too much time! Filter (where clause) *first* -- suggestions ?

From: Hursh Jain <hurshj(at)gmail(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Queries with Joins before filtering taking too much time! Filter (where clause) *first* -- suggestions ?
Date: 2017-01-15 23:46:09
Message-ID: 587C09C1.20508@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hi:

Postgres 9.6.1 on a large linux box.

I have 2 tables under consideration:

1. property (about 800,000 rows)
This has a attribute called current_owner which is FK pointing to uid in
users table. This attribute can be null.

2. users (about 400 rows)
This has users in the system, each user has a uid.

=====
Query:

SELECT
*
FROM
property p
LEFT OUTER JOIN users ON (p.current_owner = users.uid)
WHERE
p.pid in
(SELECT pid FROM reward WHERE reward_type = 'daily'
ORDER BY reward_date DESC LIMIT 30)
;

This query takes a looong time. I tried an EXPLAIN ANALYSE and this
is what I got:

QUERY PLAN
------------------------------------------------------------------------------

Merge Cond: (users.uid = p.current_owner)
-> Index Scan using pk_users on users (cost=0.27..393.66 rows=414
width=198) (actual time=0.007..0.007 rows=1 loops=1)
-> Index Scan using idx_property_current_owner on property p
(cost=0.43..9889803.81 rows=3201653 width=775) (actual
time=10785.283..10785.2..83 rows=0 loops=1)
Filter: (SubPlan 1)
Rows Removed by Filter: 760651
SubPlan 1
-> Limit (cost=1.31..1.32 rows=1 width=16) (actual time=0.011..0.011
rows=0 loops=760651)
-> Sort (cost=1.31..1.32 rows=1 width=16) (actual time=0.008..0.008
rows=0 loops=760651)
Sort Key: reward.reward_date DESC
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_reward_reward_type on reward (cost=0.29..1.30
rows=1 width=16) (actual time=0.005..0.005 rows=0.. loops=760651)
Index Cond: (reward_type = 'daily'::text)
Planning time: 5.575 ms
Execution time: 10785.510 ms
---------------------------------

You can see the index scan on property is done first, like so:

Index Scan using idx_property_current_owner on property p
(cost=0.43..9889803.81 rows=3201653 width=775)
(actual time=10785.283..10785.2..83 rows=0 loops=1)

This take 10+ seconds. But this is wasted, since the number of property
rows are then filtered to less than 20 later on (via the WHERE clause).

Any suggestions on how to do the filtering FIRST and then do the join ?
I need to left outer join with users since the user (current_owner)
might be null.

Best,
--J

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Hursh Jain 2017-01-16 00:54:47 Re: Queries with Joins before filtering taking too much time! Filter (where clause) *first* -- suggestions ?
Previous Message John R Pierce 2017-01-11 23:24:56 Re: Database of articles, LaTeX code and pictures