Regression from 9.4-9.6

From: Jim Nasby <jim(at)nasby(dot)net>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Regression from 9.4-9.6
Date: 2017-10-08 19:25:54
Message-ID: 5d7c0f80-3767-1531-d911-197fcc147173@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I've got a query that's regressed from 9.4 to 9.6. I suspect it has
something to do with the work done around bad plans from single-row
estimates. There's a SeqScan happening even though the join is to the PK
of bd_ident. Full plans are at [1,2,3], but here's the relevant bits...

9.4:
> -> Nested Loop Left Join (cost=1.00..50816.55 rows=1 width=27) (actual time=979.406..3213.286 rows=508 loops=1)
> -> Index Scan using bdata_filed_departuretime on bdata_forks (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 rows=508 loops=1)
> Index Cond: ((filed_departuretime >= '2017-07-20 05:00:00'::timestamp without time zone) AND (filed_departuretime <= '2017-07-30 04:59:59'::timestamp without time zone))
...
> -> Index Scan using bd_ident_pkey on bd_ident i (cost=0.43..4.45 rows=1 width=11) (actual time=0.006..0.006 rows=1 loops=508)
> Index Cond: (bdata_forks.ident_id = id)
> SubPlan 1
> -> Index Scan using bd_airport_pkey on bd_airport (cost=0.56..4.58 rows=1 width=20) (actual time=0.003..0.003 rows=1 loops=508)
> Index Cond: (id = bdata_forks.destination_id)

9.6:
> -> Nested Loop Left Join (cost=0.57..14994960.40 rows=1 width=71) (actual time=931.479..327972.891 rows=508 loops=1)
> Join Filter: (bdata_forks.ident_id = i.id)
> Rows Removed by Join Filter: 1713127892
> -> Index Scan using bdata_filed_departuretime on bdata_forks (cost=0.57..14894236.06 rows=1 width=36) (actual time=892.664..3025.653 rows=508 loops=1)
...
> -> Seq Scan on bd_ident i (cost=0.00..58566.00 rows=3372300 width=11) (actual time=0.002..280.966 rows=3372300 loops=508)
^^^^^^^^
> SubPlan 1
> -> Index Scan using bd_airport_pkey on bd_airport (cost=0.56..4.58 rows=1 width=20) (actual time=0.009..0.009 rows=1 loops=508)
> Index Cond: (id = bdata_forks.destination_id)

Altering the predicates somewhat (removing one of the timestamp
conditions) results in the input to the outer part of the nested loop
estimating at 326 rows instead of 1, which generates a good plan:

> -> Nested Loop Left Join (cost=1.00..14535906.91 rows=326 width=71) (actual time=23.670..4558.273 rows=3543 loops=1)
> -> Index Scan using bdata_filed_departuretime on bdata_forks (cost=0.57..14532973.05 rows=326 width=36) (actual time=23.647..4522.428 rows=3543 loops=1)

^^^^^^^^
...
> -> Index Scan using bd_ident_pkey on bd_ident i (cost=0.43..4.40 rows=1 width=11) (actual time=0.005..0.006 rows=1 loops=3543)
> Index Cond: (bdata_forks.ident_id = id)
> SubPlan 1
> -> Index Scan using bd_airport_pkey on bd_airport (cost=0.56..4.58 rows=1 width=20) (actual time=0.003..0.003 rows=1 loops=3543)
> Index Cond: (id = bdata_forks.destination_id)

1: https://explain.depesz.com/s/2A90
2: https://explain.depesz.com/s/jKdr
3: https://explain.depesz.com/s/nFh
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2017-10-08 19:34:51 Re: Regression from 9.4-9.6
Previous Message David Rowley 2017-10-03 21:44:16 Re: How does max_parallel_workers_per_gather change load averages?