Re: pg12 partitions show bad performance vs pg96

From: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL mailing lists <pgsql-performance(at)postgresql(dot)org>
Subject: Re: pg12 partitions show bad performance vs pg96
Date: 2020-03-09 13:08:49
Message-ID: CA+t6e1nrXp6HW8iaAmKCY=xn8hCw6yS32nW8WJGb+4FEX07zjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

OK so I found the problem but other problem appeared.
I found out that the pg12 machine had some limits on the vm settings in
aspect of cpu and memory. Now both machines are exactly the same in aspect
of all hardware and dont have any limit.
CPU - 8
RAM - 32GB.

I tested it with cold cache :
service postgresql stop;
echo 1 > /proc/sys/vm/drop_caches;
service postgresql start;
psql -d postgres -U postgres;

I used two simples queries, one that implicitly comparing the partition key
with a const value and another one that joins other table by the partition
column(and in this query the problem).

The first query : results are better with pg12 :
explain analyze select * from iot_data where device=51;

PG96
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on iot_data (cost=0.00..2083334.60 rows=976667 width=37) (actual
time=1.560..67144.164 rows=1010315 loops=1)
Filter: (device = 51)
Rows Removed by Filter: 98989685
Planning time: 9.219 ms
Execution time: 67,228.431 ms
(5 rows)

PG12 - 3 PARTITIONS
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..514086.40 rows=1027284 width=37) (actual
time=3.871..15022.118 rows=1010315 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on iot_data_0 (cost=0.00..410358.00 rows=428035
width=37) (actual time=1.670..14815.480 rows=336772 loops=3)
Filter: (device = 51)
Rows Removed by Filter: 9764341
Planning Time: 27.292 ms
Execution Time: 15085.317 ms
(8 rows)

The second query with pg12 :
QUERY : explain analyze select * from iot_data da,iot_device de where
de.name in ('50a') and de.id = da.device and da.metadata=50 and
lower(da.data) like '50';

PG96

postgres=# explain analyze select * from iot_data da,iot_device de where
de.name in ('50a') and de.id = da.device and da.metadata=50 and
lower(da.data) like '50';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2583361.51 rows=20 width=44) (actual
time=44894.312..44894.312 rows=0 loops=1)
Join Filter: (da.device = de.id)
-> Seq Scan on iot_device de (cost=0.00..2.25 rows=1 width=7) (actual
time=0.018..0.028 rows=1 loops=1)
Filter: (name = '50a'::text)
Rows Removed by Filter: 99
-> Seq Scan on iot_data da (cost=0.00..2583334.84 rows=1954 width=37)
(actual time=44894.279..44894.279 rows=0 loops=1)
Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
Rows Removed by Filter: 100000000
Planning time: 11.313 ms
Execution time: 44894.357 ms
(10 rows)

PG12 - 3 PARTITIONS

QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1002.26..1563512.35 rows=10 width=44) (actual
time=22306.091..22309.209 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=2.26..1562511.35 rows=4 width=44) (actual
time=22299.412..22299.413 rows=0 loops=3)
Hash Cond: (da_2.device = de.id)
-> Parallel Append (cost=0.00..1562506.90 rows=814 width=37)
(actual time=22299.411..22299.411 rows=0 loops=3)
-> Parallel Seq Scan on iot_data_2 da_2
(cost=0.00..584064.14 rows=305 width=37) (actual time=9076.535..9076.535
rows=0 loops=3)
Filter: ((metadata = 50) AND (lower(data) ~~
'50'::text))
Rows Removed by Filter: 12460009
-> Parallel Seq Scan on iot_data_1 da_1
(cost=0.00..504948.69 rows=262 width=36) (actual time=10296.751..10296.751
rows=0 loops=2)
Filter: ((metadata = 50) AND (lower(data) ~~
'50'::text))
Rows Removed by Filter: 16158316
-> Parallel Seq Scan on iot_data_0 da
(cost=0.00..473490.00 rows=247 width=37) (actual time=19075.081..19075.081
rows=0 loops=1)
Filter: ((metadata = 50) AND (lower(data) ~~
'50'::text))
Rows Removed by Filter: 30303339
-> Hash (cost=2.25..2.25 rows=1 width=7) (never executed)
-> Seq Scan on iot_device de (cost=0.00..2.25 rows=1
width=7) (never executed)
Filter: (name = '50a'::text)
Planning Time: 30.429 ms
Execution Time: 22309.364 ms
(20 rows)

I tried disabling max_parallel_workers_gathers but It just decreased the
db`s performance.
Now regarding the main issue here - as u can see when I used the second
query, I didnt mentioned the partition column specificly but I joined
another table based on it( where de.name in ('50a') and de.id = da.device)
This condition should be enough for the optimizer to understand that it
needs to scan a specific partition but it scans all the partitions. The
"never executed" tag isnt added to the partitions scans but it is added to
the joined table.

Justin - Regarding adding index on the parittion column - I dont understand
why ? the value in that column is the same for all rows in the partition,
when exactly the index will be used ?

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mariel Cherkassky 2020-03-09 13:16:47 Re: pg12 partitions show bad performance vs pg96
Previous Message Justin Pryzby 2020-03-09 12:12:13 Re: pg12 partitions show bad performance vs pg96