Re: Slow join over three tables

From: Karl Czajkowski <karlcz(at)isi(dot)edu>
To: Tim van der Linden <tim(at)shisaa(dot)jp>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Slow join over three tables
Date: 2016-04-27 15:15:55
Message-ID: 20160427151555.GA13173@moraine.isi.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Apr 27, Tim van der Linden modulated:
...
> I'm joining three fairly large tables together, and it is slow. The tables are:
>
> - "reports": 6 million rows
> - "report_drugs": 20 million rows
> - "report_adverses": 20 million rows
>
...
> All tables have indexes on the "id"/"rid" columns and on the "drug"/"adverse" columns.
>
> The query:
>
> SELECT r.id, r.age, r.gender, r.created, a.adverse, d.drug
> FROM reports r
> JOIN report_drugs d ON d.rid = r.id
> JOIN report_adverses a ON a.rid = r.id
> WHERE a.adverse = ANY (ARRAY['back pain - nonspecific', 'nonspecific back pain', 'back pain'])
> AND d.drug = ANY (ARRAY[359, 360, 361, 362, 363]) ORDER BY r.created;
>

I would suggest a few experiments to see how you can modify the plans
available to the optimizer:

1. CREATE INDEX ON report_drugs (drug, rid)
2. CREATE INDEX ON report_adverses (adverse, rid)
3. CREATE INDEX ON report (id, created)

Re-run EXPLAIN ANALYZE of your query after each of these steps to see
how each one affects planning. You might also try two variants of the
query at each step, with and without the ORDER BY.

Note, the index column order in (1) and (2) above is very important.

Karl

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Will McCormick 2016-04-27 15:25:17 Re: BDR Alter table failing
Previous Message Daniel Westermann 2016-04-27 14:39:48 Re: PG 9.5.2, freetds + tds_fdw => server crash