Re: Slow join over three tables

From: Tim van der Linden <tim(at)shisaa(dot)jp>
To: Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Slow join over three tables
Date: 2016-04-27 10:10:51
Message-ID: 20160427191051.0fd2ca76cd88dda386899f7c@shisaa.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 27 Apr 2016 01:45:55 +0000
Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com> wrote:

Hi Sameer

Thanks for taking the time to look into this!

> > ...
> Quite clearly the nested loop joins are the most costly operations here.

Indeed.

> > ...
> I suppose. It might help if the filters are performed before the join. I am
> not an expert on optimizer but I guess it might help if you change the join
> order and add duplicate conditions for reports-
>
> SELECT r.id, r.age, r.gender, r.created, a.adverse, d.drug
> FROM report_drugs d
> JOIN report_adverses a ON a.rid = d.rid
> JOIN reports r ON d.rid = r.id
> WHERE a.adverse = ANY (ARRAY['back pain - nonspecific', 'nonspecific back
> pain', 'back pain'])
> AND d.drug = ANY (ARRAY[359, 360, 361, 362, 363]) ORDER BY r.created;

Looks like a nice approach, but it did no effect to the query time. The plan for this approach:

Sort (cost=104928.07..104928.86 rows=317 width=41) (actual time=5435.210..5435.236 rows=448 loops=1)
Sort Key: r.created
Sort Method: quicksort Memory: 60kB
-> Nested Loop (cost=1.31..104914.90 rows=317 width=41) (actual time=57.230..5434.930 rows=448 loops=1)
Join Filter: (d.rid = a.rid)
-> Nested Loop (cost=0.87..93919.79 rows=13870 width=28) (actual time=0.569..2240.955 rows=14200 loops=1)
-> Index Scan using report_drugs_drug_idx on report_drugs d (cost=0.44..496.92 rows=13870 width=8) (actual time=0.565..4.678 rows=14200 loops=1)
Index Cond: (drug = ANY ('{359,360,361,362,363}'::integer[]))
-> Index Scan using reports_id_key on reports r (cost=0.43..6.73 rows=1 width=20) (actual time=0.157..0.157 rows=1 loops=14200)
Index Cond: (id = d.rid)
-> Index Scan using report_adverses_rid_idx on report_adverses a (cost=0.44..0.78 rows=1 width=21) (actual time=0.224..0.225 rows=0 loops=14200)
Index Cond: (rid = r.id)
Filter: (adverse = ANY ('{"back pain - nonspecific","nonspecific back
pain","back pain"}'::text[]))
Rows Removed by Filter: 5
Planning time: 18.512 ms
Execution time: 5435.293 ms

> OR since you are using INNER JOIN, (As far as I understand the concept of
> joins) it won't hurt the result set if the where clause is pushed into the
> INNER JOIN criteria-

Correct. I have tried those as well, but the planner seems to take the exact same path and as a result the query time is unchanged.

> > ...
>
> --
> Best Regards
> Sameer Kumar | DB Solution Architect

Cheers,
Tim

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tim van der Linden 2016-04-27 10:29:18 Re: Slow join over three tables
Previous Message Ihnat Peter | TSS Group a.s. 2016-04-27 08:40:49 Re: Background worker with Listen