Re: plan_rows confusion with parallel queries

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: plan_rows confusion with parallel queries
Date:	2016-11-02 20:00:46
Message-ID:	25945.1478116846@sss.pgh.pa.us
Lists:	pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> while eye-balling some explain plans for parallel queries, I got a bit
> confused by the row count estimates. I wonder whether I'm alone.

I got confused by that a minute ago, so no, you're not alone. The problem
is even worse in join cases. For example:

 Gather  (cost=34332.00..53265.35 rows=100 width=8)
   Workers Planned: 2
   ->  Hash Join  (cost=33332.00..52255.35 rows=100 width=8)
         Hash Cond: ((pp.f1 = cc.f1) AND (pp.f2 = cc.f2))
         ->  Append  (cost=0.00..8614.96 rows=417996 width=8)
               ->  Parallel Seq Scan on pp  (cost=0.00..8591.67 rows=416667 width=8)
               ->  Parallel Seq Scan on pp1  (cost=0.00..23.29 rows=1329 width=8)
         ->  Hash  (cost=14425.00..14425.00 rows=1000000 width=8)
               ->  Seq Scan on cc  (cost=0.00..14425.00 rows=1000000 width=8)

There are actually 1000000 rows in pp, and none in pp1. I'm not bothered
particularly by the nonzero estimate for pp1, because I know where that
came from, but I'm not very happy that nowhere here does it look like
it's estimating a million-plus rows going into the join.

regards, tom lane
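
The per-worker figure in the plan above can be reproduced with a little arithmetic. The sketch below is not part of the original message. It assumes the planner divides a parallel scan's row estimate by a "parallel divisor" equal to the number of workers plus a leader contribution of 1 - 0.3 * workers (when that is positive), which is how I understand the costing code of that era to behave. The function names are hypothetical, chosen only for illustration.

```python
def parallel_divisor(workers: int) -> float:
    """Assumed heuristic: each worker counts as 1, and the leader adds
    (1 - 0.3 * workers) when that value is positive."""
    leader_contribution = 1.0 - 0.3 * workers
    if leader_contribution > 0:
        return workers + leader_contribution
    return float(workers)


def per_worker_rows(total_rows: float, workers: int) -> int:
    """Row estimate shown on a parallel scan node (per participant)."""
    return max(1, round(total_rows / parallel_divisor(workers)))


def implied_total_rows(per_worker: float, workers: int) -> int:
    """Scale a per-participant estimate back up to a whole-relation figure."""
    return round(per_worker * parallel_divisor(workers))


if __name__ == "__main__":
    # pp holds 1000000 rows and the plan uses 2 workers.
    print(per_worker_rows(1_000_000, 2))
    # The Append node shows rows=417996; scaled back up under the same
    # assumption, this is the "million-plus" figure the plan never displays.
    print(implied_total_rows(417_996, 2))
```

Under that assumption, 1000000 rows with 2 workers gives a divisor of 2.4 and a per-participant estimate of 416667, matching the Parallel Seq Scan on pp. The rows=1329 estimate for pp1 is not modelled here, since the message does not say how it was derived.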

In response to

plan_rows confusion with parallel queries at 2016-11-02 18:42:16 from Tomas Vondra

Responses

Re: plan_rows confusion with parallel queries at 2016-11-02 22:56:37 from Tomas Vondra
Re: plan_rows confusion with parallel queries at 2016-11-03 14:44:13 from Robert Haas
