Re: Poor plan when joining against a union containing a join

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Poor plan when joining against a union containing a join
Date: 2013-03-06 23:15:08
Message-ID: 5137CDFC.1060902@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 03/06/2013 06:54 AM, David Leverton wrote:
> Hi all,
>
> I'm encountering very poor query plans when joining against a union,
> where one branch of the union is itself a join: specifically, Postgres
> does the entire inside join and then filters the result, rather than
> pushing the filters down to the joined tables. I've provided a
> standalone test case below, including some variations that don't show
> the problem for comparison. The fourth query is the problematic one -
> I would have expected the Append portion of the plan to be essentially
> the same as the one in the second query.

Thanks for the test case!

Actually, in case #4, Postgres *is* pushing down the join qual into the
segments of the Union. It's just that that's not helping performance
any; it's causing a really slow join on bundle, which is actually where
you're spending most of your time:

-> Hash Join (cost=27.50..12970.50 rows=1000000
width=8) (actual time=0.617..344.127 rows=1000000 loops=1)
Hash Cond: (bundle.bundle_type =
bundle_contents.bundle_type)
Buffers: shared hit=448
-> Seq Scan on bundle (cost=0.00..1443.00
rows=100000 width=8) (actual time=0.009..22.066 rows=100000 loops=1)
Buffers: shared hit=443
-> Hash (cost=15.00..15.00 rows=1000 width=8)
(actual time=0.594..0.594 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 40kB
Buffers: shared hit=5
-> Seq Scan on bundle_contents
(cost=0.00..15.00 rows=1000 width=8) (actual time=0.008..0.207
rows=1000 loops=1)
Buffers: shared hit=5

Clearly this is the wrong strategy; Postgres should be letting the
filter on item_reference be the driver instead of hashing the whole
bundle + bundle_contents join. I suspect that the qual pushdown into
the union is hitting an inability to transverse multiple joins, which
wouldn't surprise me; in fact, I'd be surprised if it could do so.

On a pragmatic basis, joining against complex UNION expressions is
liable to be a bit of a minefield for the next few generations of the
Postgres planner; it's just really hard to optimize. You might think of
using outer joins instead of a UNION.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2013-03-07 05:52:35 Re: Poor plan when joining against a union containing a join
Previous Message Jeff Janes 2013-03-06 16:44:00 Re: Optimize SELECT * from table WHERE foreign_key_id IN (key1, key2, key3, key4...)