(was: CTE with JOIN of two tables is much faster than a regular query)

From: Albrecht Dreß <albrecht(dot)dress(at)arcor(dot)de>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: (was: CTE with JOIN of two tables is much faster than a regular query)
Date: 2018-08-20 19:08:54
Message-ID: 4YW55TYN.OI42YDTK.ITCKMMOO@AJ45T7CZ.3PZZWAHB.65FILMPD
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Am 18.08.18 11:36 schrieb(en) kpi6288(at)gmail(dot)com:
[snip]
> What can I do to improve the performance of the regular query without using a CTE?

Sorry for jumping into this discussion late – I'm facing similar problems with Postgres choosing strange and inefficient query plans for no (for me) apparent reason. I use the DEB packages postgresql-10, version 10.5-1.pgdg90+1, on a Debian stretch box.

The relevant part of the database structure is:

--8<-----------------------------------------------------------------------------------------------
mydb=> \d strings
Table "public.strings"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+--------------------------------------
iid | bigint | | not null |
sid | bigint | | not null | nextval('strings_sid_seq'::regclass)
stype | text | | |
string | text | | |
Indexes:
"strings_pkey" PRIMARY KEY, btree (iid, sid)
"idx_strings_string_gin" gin (string gin_trgm_ops)
"idx_stype" btree (stype)
Foreign-key constraints:
"strings_iid_fkey" FOREIGN KEY (iid) REFERENCES items(iid) ON DELETE CASCADE

mydb=> \d items
Table "public.items"
Column | Type | Collation | Nullable | Default
---------------+---------------+-----------+----------+------------------------------------
dbid | bigint | | not null |
iid | bigint | | not null | nextval('items_iid_seq'::regclass)
riid | integer | | |
[…more columns…]
Indexes:
"items_pkey" PRIMARY KEY, btree (iid)
"idx_items_riid" btree (riid)
"items_dbid" btree (dbid)
[…more indexes…]
Referenced by:
TABLE "strings" CONSTRAINT "strings_iid_fkey" FOREIGN KEY (iid) REFERENCES items(iid) ON DELETE CASCADE
[…more references…]
--8<-----------------------------------------------------------------------------------------------

The table “strings” contains about 2 * 10e7 active rows, “items” about 10e8.

The “instability” occurs with the following somewhat trivial query. In the correct (IMO) case, the indexes are used:

--8<-----------------------------------------------------------------------------------------------
mydb=> EXPLAIN ANALYZE SELECT items.iid, stype, string, riid FROM items LEFT JOIN strings USING(iid) WHERE stype ~ E'^tag\\..*(?<\!\\.\\d+)$' AND dbid = 7416000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=1.13..522716.95 rows=8 width=133) (actual time=0.078..0.715 rows=16 loops=1)
-> Index Scan using items_dbid on items (cost=0.57..1377.96 rows=773 width=12) (actual time=0.021..0.038 rows=19 loops=1)
Index Cond: (dbid = 7416000)
-> Index Scan using strings_pkey on strings (cost=0.56..674.18 rows=26 width=129) (actual time=0.030..0.035 rows=1 loops=19)
Index Cond: (iid = items.iid)
Filter: (stype ~ '^tag\..*(?<!\.\d+)$'::text)
Rows Removed by Filter: 3
Planning time: 1.685 ms
Execution time: 0.762 ms
(9 rows)
--8<-----------------------------------------------------------------------------------------------

However, seemingly at random, Postgres chooses the following plan which is (planning plus execution) ~1500 times slower:

--8<-----------------------------------------------------------------------------------------------
mydb=> EXPLAIN ANALYZE SELECT items.iid, stype, string, riid FROM items LEFT JOIN strings USING(iid) WHERE stype ~ E'^tag\\..*(?<\!\\.\\d+)$' AND dbid = 7416000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=84945.47..522033.97 rows=9 width=133) (actual time=1401.570..3868.239 rows=16 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=83945.47..521033.07 rows=4 width=133) (actual time=2206.088..3823.982 rows=5 loops=3)
Hash Cond: (strings.iid = items.iid)
-> Parallel Bitmap Heap Scan on strings (cost=82539.52..518233.10 rows=531057 width=129) (actual time=390.479..3795.902 rows=401149 loops=3)
Filter: (stype ~ '^tag\..*(?<!\.\d+)$'::text)
Rows Removed by Filter: 384802
Heap Blocks: exact=76067
-> Bitmap Index Scan on idx_stype (cost=0.00..82220.88 rows=2334832 width=0) (actual time=340.725..340.725 rows=2357863 loops=1)
Index Cond: ((stype >= 'tag.'::text) AND (stype < 'tag/'::text))
-> Hash (cost=1395.77..1395.77 rows=814 width=12) (actual time=0.137..0.137 rows=19 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Scan using items_dbid on items (cost=0.57..1395.77 rows=814 width=12) (actual time=0.072..0.126 rows=19 loops=3)
Index Cond: (dbid = 7416000)
Planning time: 2.617 ms
Execution time: 3868.303 ms
(17 rows)
--8<-----------------------------------------------------------------------------------------------

It looks as if the selection of the plan is more or less random, and does /not/ depend on the statistics state. I.e. running “vacuum analyze strings; vacuum analyze items;” immediately before the query does /not/ result in a reproducible behaviour (a /very/ small number if entries may have been added or deleted between the calls in both tables, though).

My solution for a stable (but slower than the query utilising the indexes) response time is also using a CTE. However, it would be helpful to fix (or at least understand) the behaviour.

Best,
Albrecht.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Stephen Frost 2018-08-20 19:15:43 Re: upgrading from pg 9.3 to 10
Previous Message bricklen 2018-08-20 18:45:10 Re: upgrading from pg 9.3 to 10