Quick Links

Poor performance when joining against inherited tables

From:	Lucas Madar <madar(at)samsix(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Poor performance when joining against inherited tables
Date:	2011-04-11 20:11:58
Message-ID:	sig.50826e4e2f.4DA3608E.9090906@samsix.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

I have a database that contains many tables, each with some common
characteristics. For legacy reasons, they have to be implemented in a
way so that they are *all* searchable by an older identifier to find the
newer identifier. To do this, we've used table inheritance.

Each entry has an id, as well as a legacyid1 and legacyid2. There's a
master table that the application uses, containing a base representation
and common characteristics:

objects ( id, ... )
item ( id, legacyid1, legacyid2 )
| - itemXX
| - itemYY

There is nothing at all in the item table, it's just used for
inheritance. However, weird things happen when this table is joined:

EXPLAIN ANALYZE SELECT * FROM objects INNER JOIN item f USING ( id );

QUERY PLAN
------------
Hash Join (cost=457943.85..1185186.17 rows=8643757 width=506)
Hash Cond: (f.id = objects.id)
-> Append (cost=0.00..224458.57 rows=8643757 width=20)
-> Seq Scan on item f (cost=0.00..26.30 rows=1630 width=20)
-> Seq Scan on itemXX f (cost=0.00..1.90 rows=90 width=20)
-> Seq Scan on itemYY f (cost=0.00..7.66 rows=266 width=20)
-> Seq Scan on itemZZ f (cost=0.00..1.02 rows=2 width=20)
...
-> Hash (cost=158447.49..158447.49 rows=3941949 width=490)
-> Seq Scan on objects (cost=0.00..158447.49 rows=3941949
width=490)

This scans everything over everything, and obviously takes forever
(there are millions of rows in the objects table, and tens of thousands
in each itemXX table).

However, if I disable seqscan (set enable_seqscan=false), I get the
following plan:

QUERY PLAN
------------
Hash Join (cost=10001298843.53..290002337961.71 rows=8643757 width=506)
Hash Cond: (f.id = objects.id)
-> Append (cost=10000000000.00..290000536334.43 rows=8643757 width=20)
-> Seq Scan on item f (cost=10000000000.00..10000000026.30
rows=1630 width=20)
-> Index Scan using xxx_pkey on itemXX f (cost=0.00..10.60
rows=90 width=20)
-> Index Scan using yyy_pkey on itemYY f (cost=0.00..25.24
rows=266 width=20)
-> Index Scan using zzz_pkey on itemZZ f (cost=0.00..9.28
rows=2 width=20)
...
-> Hash (cost=999347.17..999347.17 rows=3941949 width=490)
-> Index Scan using objects_pkey on objects
(cost=0.00..999347.17 rows=3941949 width=490)

This seems like a much more sensible query plan. But it seems to think
doing a sequential scan on the *empty* item table is excessively
expensive in this case.

Aside from enable_seqscan=false, is there any way I can make the query
planner not balk over doing a seqscan on an empty table?

Thanks,
Lucas Madar

Responses

Re: Poor performance when joining against inherited tables at 2011-04-12 13:22:05 from Shaun Thomas
Re: Poor performance when joining against inherited tables at 2011-05-11 16:38:44 from Robert Haas

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Glyn Astill	2011-04-11 20:17:04	Re: Linux: more cores = less concurrency.
Previous Message	david	2011-04-11 20:04:58	Re: Linux: more cores = less concurrency.