Re: Very poor performance with Nested Loop Anti Join

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Joseph Krogh <andreas(at)visena(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very poor performance with Nested Loop Anti Join
Date: 2016-08-01 23:15:05
Message-ID: 21013.1470093305@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Andreas Joseph Krogh <andreas(at)visena(dot)com> writes:
> This query performs terribly slow (~26 minutes,1561346.597ms):

Seems like the key misestimation is on the inner antijoin:

> -> Hash Anti Join (cost=654.21..4008.72 rows=1 width=8) (actual time=9.016..40.672 rows=76174 loops=1)
> Hash Cond: (il.invoice_id = creditnote.credit_against)
> -> Seq Scan on onp_crm_invoice_line il (cost=0.00..3062.01 rows=78001 width=8) (actual time=0.005..11.259 rows=78614 loops=1)
> -> Hash (cost=510.56..510.56 rows=11492 width=8) (actual time=8.940..8.940 rows=372 loops=1)
> Buckets: 16384 Batches: 1 Memory Usage: 143kB
> -> Seq Scan on onp_crm_invoice creditnote (cost=0.00..510.56 rows=11492 width=8) (actual time=0.014..7.882 rows=11507 loops=1)
> Filter: ((sent_date <= '2016-06-27'::date) AND ((status_key)::text = 'INVOICE_STATUS_INVOICED'::text))
> Rows Removed by Filter: 149

If it realized that this produces 78k rows not 1, it'd likely do something
smarter at the outer antijoin.

I have no idea why that estimate's so far off though. What PG version is
this? Stats all up to date on these two tables? Are the rows excluded
by the filter condition on "creditnote" significantly different from the
rest of that table?

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Om Prakash Jaiswal 2016-08-02 05:47:19 Create language plperlu Error
Previous Message Andreas Joseph Krogh 2016-08-01 13:40:48 Re: Very poor performance with Nested Loop Anti Join