Problems with hash join over nested loop

From: Jim Nasby <jnasby(at)enova(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Problems with hash join over nested loop
Date: 2013-10-28 21:58:15
Message-ID: 526EDDF7.3090102@enova.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I've been working on trying to normalize a table that's got a bunch of text fields. Normalizing the first 4 has been a non-issue. But when I try and normalize 2 additional fields a bunch of query plans go belly-up.

I've seen this on our old 8.4 databases as well as 9.1. I haven't been able to test on anything newer yet.

A representative query is below. Note that this is a stripped down version of something larger, so it seems a bit silly the way it's written, but I've got others that are exhibiting the same bad behavior:

EXPLAIN ANALYZE
select * from (
select distinct e.login AS Login, d.date1, d.date2,
(SELECT COUNT(DISTINCT n.customer_id) FROM notes n
inner join loans l on n.customer_id = l.customer_id
inner join loan_tasks_committed as ltc on l.id = ltc.loan_id
AND n.note_time BETWEEN date1 AND date2 + interval '1 day'
AND n.activity_cd in ('help_draw')
AND ltc.created_on between n.note_time - interval '10 minutes' and n.note_time + interval '1 day'
AND getcust(ltc.created_by)=e.login
where n.employee_id = e.id
) AS "Draw"
FROM
(select current_date-1 as date1, current_date-1 as date2) as d, employees e
inner join employees_roles e_r on e.id=e_r.employee_id
WHERE
e_r.role_id IN (3,23) --Inbound Customer Support Operator or Customer Service Issue Rep
GROUP BY
Login,
e.id,
date1,
date2
ORDER BY Login) a
WHERE "Draw" > 0
;

loan_tasks_committed is the table that's been normalized. For testing I'm swapping between tables with two normalization views that boil down to:

_by:
SELECT lt_by.id, lt_by.lock_version, c.id AS created_by_id, c.by AS created_by, u.id AS updated_by_id, u.by AS updated_by, lt_by.created_on, lt_by.updated_on, lt_by.loan_id, lt_by.entered_on, lt_by.acct_date, lt_by.task_amount, lt_by.loan_task_cd, lt_by.parent_id, lt_by.type_cd, e.id AS entered_by_id, e.by AS entered_by, cl.id AS collected_by_id, cl.by AS collected_by, lt_by.currency_cd, lt_by.committed, lt_by.loan_task_code_id
FROM lt_by
LEFT JOIN by_text c ON lt_by.created_by_id = c.id
LEFT JOIN by_text u ON lt_by.updated_by_id = u.id
LEFT JOIN by_text e ON lt_by.entered_by_id = e.id
LEFT JOIN by_text cl ON lt_by.collected_by_id = cl.id;

_cd:
SELECT lt_cd.id, lt_cd.lock_version, c.id AS created_by_id, c.by AS created_by, u.id AS updated_by_id, u.by AS updated_by, lt_cd.created_on, lt_cd.updated_on, lt_cd.loan_id, lt_cd.entered_on, lt_cd.acct_date, lt_cd.task_amount, ltc.id AS loan_task_code_id, ltc.loan_task_code, ltc.loan_task_code AS loan_task_cd, lt_cd.parent_id, tc.id AS loan_task_type_id, tc.loan_task_type, tc.loan_task_type AS type_cd, e.id AS entered_by_id, e.by AS entered_by, cl.id AS collected_by_id, cl.by AS collected_by, lt_cd.currency_cd, lt_cd.committed
FROM lt_cd
LEFT JOIN by_text c ON lt_cd.created_by_id = c.id
LEFT JOIN by_text u ON lt_cd.updated_by_id = u.id
LEFT JOIN by_text e ON lt_cd.entered_by_id = e.id
LEFT JOIN by_text cl ON lt_cd.collected_by_id = cl.id
LEFT JOIN lt_code ltc ON lt_cd.loan_task_code_id = ltc.id
LEFT JOIN lt_type tc ON lt_cd.loan_task_type_id = tc.id;

As you can see, they're identical except for normalizing 2 additional fields.

The additional normalization results in a moderate amount of heap space savings:

SELECT table_name, rows, heap_size, index_size, toast_size, total_size FROM tools.space WHERE table_schema='jnasby';
table_name | rows | heap_size | index_size | toast_size | total_size
------------------+-------------+------------+------------+------------+------------
by_text | 721492 | 41 MB | 49 MB | 40 kB | 90 MB
lt_code | 42 | 8192 bytes | 32 kB | 32 kB | 72 kB
lt_type | 3 | 8192 bytes | 32 kB | 32 kB | 72 kB
lt_cd | 9.82601e+07 | 10 GB | 15 GB | 2832 kB | 25 GB
lt_by | 9.82615e+07 | 12 GB | 21 GB | 3360 kB | 33 GB

I've got full explain analyze below, but the relevant bits are:

_by:
SubPlan 2
-> Aggregate (cost=489.06..489.07 rows=1 width=4) (actual time=0.083..0.083 rows=1 loops=692)
-> Nested Loop (cost=0.00..489.05 rows=1 width=4) (actual time=0.080..0.080 rows=0 loops=692)
-> Nested Loop (cost=0.00..485.80 rows=10 width=8) (actual time=0.079..0.079 rows=0 loops=692)
Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval)))
-> Nested Loop (cost=0.00..383.44 rows=14 width=16) (actual time=0.054..0.059 rows=0 loops=692)
-> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (actual time=0.053..0.057 rows=0 loops=692)
Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
Filter: ((activity_cd)::text = 'help_draw'::text)
-> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (actual time=0.034..0.042 rows=2 loops=18)
Index Cond: (customer_id = n.customer_id)
-> Index Scan using lt_by__loan_id on lt_by (cost=0.00..6.50 rows=36 width=28) (actual time=0.043..0.477 rows=60 loops=28)
Index Cond: (loan_id = l.id)
-> Index Scan using by_text_pkey on by_text c (cost=0.00..0.31 rows=1 width=4) (never executed)
Index Cond: (id = jnasby.lt_by.created_by_id)
Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)

_cd:
SubPlan 2
-> Aggregate (cost=3089331.15..3089331.16 rows=1 width=4) (actual time=372.589..372.589 rows=1 loops=692)
-> Hash Join (cost=16560.08..3089331.14 rows=1 width=4) (actual time=372.586..372.586 rows=0 loops=692)
Hash Cond: (jnasby.lt_cd.loan_id = l.id)
Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval)))
-> Hash Join (cost=16176.47..3087105.37 rows=491238 width=12) (actual time=14610.882..32223.008 rows=2642 loops=8)
Hash Cond: (jnasby.lt_cd.created_by_id = c.id)
-> Seq Scan on lt_cd (cost=0.00..2329065.44 rows=98260144 width=32) (actual time=0.018..15123.405 rows=98261588 loops=8)
-> Hash (cost=16131.38..16131.38 rows=3607 width=4) (actual time=4519.878..4519.878 rows=20 loops=8)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on by_text c (cost=0.00..16131.38 rows=3607 width=4) (actual time=4405.172..4519.861 rows=20 loops=8)
Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
-> Hash (cost=383.44..383.44 rows=14 width=16) (actual time=0.055..0.055 rows=0 loops=692)
Buckets: 1024 Batches: 1 Memory Usage: 0kB
-> Nested Loop (cost=0.00..383.44 rows=14 width=16) (actual time=0.051..0.055 rows=0 loops=692)
-> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (actual time=0.049..0.052 rows=0 loops=692)
Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
Filter: ((activity_cd)::text = 'help_draw'::text)
-> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (actual time=0.033..0.040 rows=2 loops=18)
Index Cond: (customer_id = n.customer_id)

I've tried disabling execution paths to get it to loop join, but no matter what it insists on seqscanning lt_cd. I've verified that simpler queries will use an index scan of loan_id (ie: WHERE loan_id < 999999).

Again, I've got multiple queries exhibiting this problem... this is NOT directly related to this query.

_by: http://explain.depesz.com/s/LYOY

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on a (cost=843581.41..843600.81 rows=862 width=24) (actual time=64.735..64.735 rows=0 loops=1)
-> Unique (cost=843581.41..843592.19 rows=862 width=20) (actual time=64.735..64.735 rows=0 loops=1)
-> Sort (cost=843581.41..843583.57 rows=862 width=20) (actual time=64.733..64.733 rows=0 loops=1)
Sort Key: e.login, ((('now'::text)::date - 1)), ((('now'::text)::date - 1)), ((SubPlan 1))
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=380.35..843539.38 rows=862 width=20) (actual time=64.726..64.726 rows=0 loops=1)
Filter: ((SubPlan 2) > 0)
-> Hash Join (cost=305.29..371.73 rows=862 width=20) (actual time=4.765..5.866 rows=895 loops=1)
Hash Cond: (e_r.employee_id = e.id)
-> Nested Loop (cost=15.18..67.61 rows=862 width=12) (actual time=0.228..0.976 rows=895 loops=1)
-> Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.009..0.009 rows=1 loops=1)
-> Bitmap Heap Scan on employees_roles e_r (cost=15.18..58.96 rows=862 width=4) (actual time=0.214..0.782 rows=895 loops=1)
Recheck Cond: (role_id = ANY ('{3,23}'::integer[]))
-> Bitmap Index Scan on employees_roles_m1 (cost=0.00..14.97 rows=862 width=0) (actual time=0.195..0.195 rows=895 loops=1)
Index Cond: (role_id = ANY ('{3,23}'::integer[]))
-> Hash (cost=247.27..247.27 rows=3427 width=12) (actual time=4.521..4.521 rows=3427 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 154kB
-> Seq Scan on employees e (cost=0.00..247.27 rows=3427 width=12) (actual time=0.021..3.595 rows=3427 loops=1)
SubPlan 1
-> Aggregate (cost=489.06..489.07 rows=1 width=4) (never executed)
-> Nested Loop (cost=0.00..489.05 rows=1 width=4) (never executed)
-> Nested Loop (cost=0.00..485.80 rows=10 width=8) (never executed)
Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval)))
-> Nested Loop (cost=0.00..383.44 rows=14 width=16) (never executed)
-> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (never executed)
Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
Filter: ((activity_cd)::text = 'help_draw'::text)
-> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (never executed)
Index Cond: (customer_id = n.customer_id)
-> Index Scan using lt_by__loan_id on lt_by (cost=0.00..6.50 rows=36 width=28) (never executed)
Index Cond: (loan_id = l.id)
-> Index Scan using by_text_pkey on by_text c (cost=0.00..0.31 rows=1 width=4) (never executed)
Index Cond: (id = jnasby.lt_by.created_by_id)
Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
SubPlan 2
-> Aggregate (cost=489.06..489.07 rows=1 width=4) (actual time=0.083..0.083 rows=1 loops=692)
-> Nested Loop (cost=0.00..489.05 rows=1 width=4) (actual time=0.080..0.080 rows=0 loops=692)
-> Nested Loop (cost=0.00..485.80 rows=10 width=8) (actual time=0.079..0.079 rows=0 loops=692)
Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval)))
-> Nested Loop (cost=0.00..383.44 rows=14 width=16) (actual time=0.054..0.059 rows=0 loops=692)
-> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (actual time=0.053..0.057 rows=0 loops=692)
Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
Filter: ((activity_cd)::text = 'help_draw'::text)
-> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (actual time=0.034..0.042 rows=2 loops=18)
Index Cond: (customer_id = n.customer_id)
-> Index Scan using lt_by__loan_id on lt_by (cost=0.00..6.50 rows=36 width=28) (actual time=0.043..0.477 rows=60 loops=28)
Index Cond: (loan_id = l.id)
-> Index Scan using by_text_pkey on by_text c (cost=0.00..0.31 rows=1 width=4) (never executed)
Index Cond: (id = jnasby.lt_by.created_by_id)
Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
Total runtime: 65.070 ms
(51 rows)

_cd: http://explain.depesz.com/s/fkc
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on a (cost=5326007345.03..5326007364.42 rows=862 width=24) (actual time=257838.254..257838.254 rows=0 loops=1)
-> Unique (cost=5326007345.03..5326007355.80 rows=862 width=20) (actual time=257838.254..257838.254 rows=0 loops=1)
-> Sort (cost=5326007345.03..5326007347.18 rows=862 width=20) (actual time=257838.254..257838.254 rows=0 loops=1)
Sort Key: e.login, ((('now'::text)::date - 1)), ((('now'::text)::date - 1)), ((SubPlan 1))
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=380.35..5326007303.00 rows=862 width=20) (actual time=257838.246..257838.246 rows=0 loops=1)
Filter: ((SubPlan 2) > 0)
-> Hash Join (cost=305.29..371.73 rows=862 width=20) (actual time=3.916..4.789 rows=895 loops=1)
Hash Cond: (e_r.employee_id = e.id)
-> Nested Loop (cost=15.18..67.61 rows=862 width=12) (actual time=0.208..0.788 rows=895 loops=1)
-> Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.006..0.006 rows=1 loops=1)
-> Bitmap Heap Scan on employees_roles e_r (cost=15.18..58.96 rows=862 width=4) (actual time=0.191..0.626 rows=895 loops=1)
Recheck Cond: (role_id = ANY ('{3,23}'::integer[]))
-> Bitmap Index Scan on employees_roles_m1 (cost=0.00..14.97 rows=862 width=0) (actual time=0.173..0.173 rows=895 loops=1)
Index Cond: (role_id = ANY ('{3,23}'::integer[]))
-> Hash (cost=247.27..247.27 rows=3427 width=12) (actual time=3.696..3.696 rows=3427 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 154kB
-> Seq Scan on employees e (cost=0.00..247.27 rows=3427 width=12) (actual time=0.012..2.967 rows=3427 loops=1)
SubPlan 1
-> Aggregate (cost=3089331.15..3089331.16 rows=1 width=4) (never executed)
-> Hash Join (cost=16560.08..3089331.14 rows=1 width=4) (never executed)
Hash Cond: (jnasby.lt_cd.loan_id = l.id)
Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval)))
-> Hash Join (cost=16176.47..3087105.37 rows=491238 width=12) (never executed)
Hash Cond: (jnasby.lt_cd.created_by_id = c.id)
-> Seq Scan on lt_cd (cost=0.00..2329065.44 rows=98260144 width=32) (never executed)
-> Hash (cost=16131.38..16131.38 rows=3607 width=4) (never executed)
-> Seq Scan on by_text c (cost=0.00..16131.38 rows=3607 width=4) (never executed)
Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
-> Hash (cost=383.44..383.44 rows=14 width=16) (never executed)
-> Nested Loop (cost=0.00..383.44 rows=14 width=16) (never executed)
-> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (never executed)
Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
Filter: ((activity_cd)::text = 'help_draw'::text)
-> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (never executed)
Index Cond: (customer_id = n.customer_id)
SubPlan 2
-> Aggregate (cost=3089331.15..3089331.16 rows=1 width=4) (actual time=372.589..372.589 rows=1 loops=692)
-> Hash Join (cost=16560.08..3089331.14 rows=1 width=4) (actual time=372.586..372.586 rows=0 loops=692)
Hash Cond: (jnasby.lt_cd.loan_id = l.id)
Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval)))
-> Hash Join (cost=16176.47..3087105.37 rows=491238 width=12) (actual time=14610.882..32223.008 rows=2642 loops=8)
Hash Cond: (jnasby.lt_cd.created_by_id = c.id)
-> Seq Scan on lt_cd (cost=0.00..2329065.44 rows=98260144 width=32) (actual time=0.018..15123.405 rows=98261588 loops=8)
-> Hash (cost=16131.38..16131.38 rows=3607 width=4) (actual time=4519.878..4519.878 rows=20 loops=8)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on by_text c (cost=0.00..16131.38 rows=3607 width=4) (actual time=4405.172..4519.861 rows=20 loops=8)
Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
-> Hash (cost=383.44..383.44 rows=14 width=16) (actual time=0.055..0.055 rows=0 loops=692)
Buckets: 1024 Batches: 1 Memory Usage: 0kB
-> Nested Loop (cost=0.00..383.44 rows=14 width=16) (actual time=0.051..0.055 rows=0 loops=692)
-> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (actual time=0.049..0.052 rows=0 loops=692)
Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
Filter: ((activity_cd)::text = 'help_draw'::text)
-> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (actual time=0.033..0.040 rows=2 loops=18)
Index Cond: (customer_id = n.customer_id)
Total runtime: 257838.571 ms
(57 rows)
--
Jim Nasby, Lead Data Architect (512) 569-9461

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2013-10-28 23:13:55 Re: Problems with hash join over nested loop
Previous Message sparikh 2013-10-28 20:57:00 Re: Hot Standby performance issue