Re: pg12 partitions show bad performance vs pg96

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL mailing lists <pgsql-performance(at)postgresql(dot)org>
Subject: Re: pg12 partitions show bad performance vs pg96
Date: 2020-03-09 21:43:46
Message-ID: CAApHDvpCenS3t_dV99P==XOvO6EJi9575hdBz43JqwdCWuz8Mw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 10 Mar 2020 at 02:08, Mariel Cherkassky
<mariel(dot)cherkassky(at)gmail(dot)com> wrote:

> PG12 - 3 PARTITIONS
>
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------------
> Gather (cost=1002.26..1563512.35 rows=10 width=44) (actual time=22306.091..22309.209 rows=0 loops=1)
> Workers Planned: 2
> Workers Launched: 2
> -> Hash Join (cost=2.26..1562511.35 rows=4 width=44) (actual time=22299.412..22299.413 rows=0 loops=3)
> Hash Cond: (da_2.device = de.id)
> -> Parallel Append (cost=0.00..1562506.90 rows=814 width=37) (actual time=22299.411..22299.411 rows=0 loops=3)
> -> Parallel Seq Scan on iot_data_2 da_2 (cost=0.00..584064.14 rows=305 width=37) (actual time=9076.535..9076.535 rows=0 loops=3)
> Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
> Rows Removed by Filter: 12460009
> -> Parallel Seq Scan on iot_data_1 da_1 (cost=0.00..504948.69 rows=262 width=36) (actual time=10296.751..10296.751 rows=0 loops=2)
> Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
> Rows Removed by Filter: 16158316
> -> Parallel Seq Scan on iot_data_0 da (cost=0.00..473490.00 rows=247 width=37) (actual time=19075.081..19075.081 rows=0 loops=1)
> Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
> Rows Removed by Filter: 30303339
> -> Hash (cost=2.25..2.25 rows=1 width=7) (never executed)
> -> Seq Scan on iot_device de (cost=0.00..2.25 rows=1 width=7) (never executed)
> Filter: (name = '50a'::text)
> Planning Time: 30.429 ms
> Execution Time: 22309.364 ms
> (20 rows)

From what I can work out, the DDL you used here is:

-- you didn't seem to specify the DDL for iot_device, so I used:
create table iot_device (
id bigint primary key,
name text not null
);

insert into iot_device select x,x::Text || 'a' from generate_Series(1,100) x;

create table iot_data(id serial ,data text,metadata bigint,device
bigint references iot_device(id),primary key(id,device)) partition by
hash(device);
create table iot_data_0 partition of iot_data for values with (MODULUS
3, remainder 0);
create table iot_data_1 partition of iot_data for values with (MODULUS
3, remainder 1);
create table iot_data_2 partition of iot_data for values with (MODULUS
3, remainder 2);

insert into iot_data select
generate_series(1,10000000),random()*10,random()*254,random()*99+1;
create index on iot_data(metadata,lower(data));
vacuum analyze iot_data;

In which case, you're getting a pretty different plan than I am. (I
admit that I only tried on current master and not PG12.2, however, I
see no reason that PG12.2 shouldn't produce the same plan)

I get:

# explain analyze select * from iot_data da,iot_device de where
de.name in ('50a') and de.id = da.device and da.metadata=50 and
lower(da.data) like '50';

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.56..28.04 rows=1 width=49) (actual
time=0.058..0.058 rows=0 loops=1)
Join Filter: (da.device = de.id)
-> Seq Scan on iot_device de (cost=0.00..2.25 rows=1 width=11)
(actual time=0.013..0.016 rows=1 loops=1)
Filter: (name = '50a'::text)
Rows Removed by Filter: 99
-> Append (cost=0.56..25.76 rows=3 width=38) (actual
time=0.040..0.040 rows=0 loops=1)
-> Index Scan using iot_data_0_metadata_lower_idx on
iot_data_0 da_1 (cost=0.56..8.58 rows=1 width=38) (actual
time=0.020..0.020 rows=0 loops=1)
Index Cond: ((metadata = 50) AND (lower(data) = '50'::text))
Filter: (lower(data) ~~ '50'::text)
-> Index Scan using iot_data_1_metadata_lower_idx on
iot_data_1 da_2 (cost=0.56..8.58 rows=1 width=38) (actual
time=0.010..0.010 rows=0 loops=1)
Index Cond: ((metadata = 50) AND (lower(data) = '50'::text))
Filter: (lower(data) ~~ '50'::text)
-> Index Scan using iot_data_2_metadata_lower_idx on
iot_data_2 da_3 (cost=0.56..8.58 rows=1 width=38) (actual
time=0.009..0.009 rows=0 loops=1)
Index Cond: ((metadata = 50) AND (lower(data) = '50'::text))
Filter: (lower(data) ~~ '50'::text)
Planning Time: 0.280 ms
Execution Time: 0.094 ms
(17 rows)

Are you certain that you added an index on iot_data (metadata, lower(data)); ?

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Michael 2020-03-11 19:31:48 Slow ext'd query via client native implementation vs. libpq & simple psql
Previous Message Mariel Cherkassky 2020-03-09 13:16:47 Re: pg12 partitions show bad performance vs pg96