Re: BUG #18588: Cannot force/let database use parallel execution in simple case.

From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18588: Cannot force/let database use parallel execution in simple case.
Date: 2024-08-22 16:41:37
Message-ID: CAK-MWwTqDRerxrpaN8Xb25DqDGBw1UT+YSkfp4QN-MgOH97c+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Aug 22, 2024 at 5:34 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Fri, 23 Aug 2024 at 01:14, PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
> > set min_parallel_index_scan_size to '8kB';
> >
> > set min_parallel_table_scan_size to '8kB';
> > set parallel_tuple_cost to 0;
> > set parallel_setup_cost to 0;
> >
> > have no effect.
> >
> > Even with the set force_parallel_mode to on - no effect:
> > negotiation_chat_archive=# explain analyze select topic_id as y0_ from
> > public.negotiation_topic_archive_p005 this_ where
> > this_.employer_id='816144';
>
> Does it choose a parallel plan if you do:
>
> ALTER TABLE public.negotiation_topic_archive_p005 SET (parallel_workers = 2);
>
> I assume it must be a btree index given the other plan does use a
> parallel scan. So I wondered if something weird was happening in
> compute_parallel_worker() and it was choosing 1 worker, which might
> cause the path to be rejected.
>
> You might need to come up with a self-contained test case here as when
> I tried this on PG15, I do get a parallel index scan.
>
> create table t1 (a int);
> insert into t1 select x/1000 from generate_Series(1,1000000)X;
> create index on t1(a);
> set min_parallel_index_scan_size to '8kB';
> set min_parallel_table_scan_size to '8kB';
> set parallel_tuple_cost to 0;
> set parallel_setup_cost to 0;
> explain select * from t1 where a < 10;
> QUERY PLAN
> --------------------------------------------------------------------------------------------
> Gather (cost=0.42..179.40 rows=8912 width=4)
> Workers Planned: 2
> -> Parallel Index Only Scan using t1_a_idx on t1
> (cost=0.42..179.40 rows=3713 width=4)
> Index Cond: (a < 10)
> (4 rows)
>
> select version();
> version
> -------------------------------------------------------------------------------------------------------
> PostgreSQL 15.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
> 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
> (1 row)
>
> David

Hi David,

Some additional info about table:
\dt+ negotiation_topic_archive_p005
List of relations
Schema | Name | Type | Owner |
Persistence | Access method | Size | Description
--------+--------------------------------+-------+----------+-------------+---------------+--------+-------------
public | negotiation_topic_archive_p005 | table | postgres |
permanent | heap | 112 GB |

select relpages from pg_class where relname='negotiation_topic_archive_p005';
relpages
----------
14688625

select count(*) from negotiation_topic_archive_p005;
count
-----------
499710323
Quite big table, but nothing unusual for contemporary hardware.
Hm integer(?) overflow somewhere in planner costing model of parallel scans?
I'll looked into compute_parallel_worker code (
https://doxygen.postgresql.org/paths_8h.html#a64fa1e38a91614994697a7404cea242a
) but I didn't found anything obviously wrong for me there (but my C
knowledge pretty limited).

There are some results of different tests:
0. ALTER TABLE public.negotiation_topic_archive_p005 SET
(parallel_workers = 2); - have no effect

1. Create fresh table on same db with only required data (local
minimal test case):
Unfortunately - no luck:

--trying make miminal isolated case
create table user_tmp.negotiation_topic_archive_test as select
employer_id, applicant_id from negotiation_topic_archive_p005;
SELECT 499710323

create index negotiation_topic_archive_test_employer_id_key on
user_tmp.negotiation_topic_archive_test(employer_id);
CREATE INDEX

analyze verbose user_tmp.negotiation_topic_archive_test;
INFO: analyzing "user_tmp.negotiation_topic_archive_test"
INFO: "negotiation_topic_archive_test": scanned 30000 of 2211108
pages, containing 6780000 live rows and 0 dead rows; 30000 rows in
sample, 499710408 estimated total rows
ANALYZE

select relpages from pg_class where relname='negotiation_topic_archive_test';
relpages
----------
2211108

--no luck
explain
select * from user_tmp.negotiation_topic_archive_test this_
where
this_.employer_id=816144;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using negotiation_topic_archive_test_employer_id_key on
negotiation_topic_archive_test this_ (cost=0.11..2047834.92
rows=449739 width=8)
Index Cond: (employer_id = 816144)

--still no luck
set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size to 0;
set parallel_tuple_cost to 0;
set parallel_setup_cost to 0;
set force_parallel_mode to on;

explain
select * from user_tmp.negotiation_topic_archive_test this_
where
this_.employer_id=816144;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.11..2047834.92 rows=449739 width=8)
Workers Planned: 1
Single Copy: true
-> Index Scan using negotiation_topic_archive_test_employer_id_key
on negotiation_topic_archive_test this_ (cost=0.11..2047834.92
rows=449739 width=8)
Index Cond: (employer_id = 816144)

--with not exists - still ok
explain
select * from user_tmp.negotiation_topic_archive_test this_
where
this_.employer_id=816144
AND NOT EXISTS (
SELECT FROM applicant_black_list
WHERE this_.employer_id=applicant_black_list.employer_id
AND this_.applicant_id=applicant_black_list.applicant_id
);

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=10.27..2045714.44 rows=449737 width=8)
Workers Planned: 4
-> Parallel Hash Anti Join (cost=10.27..2045714.44 rows=112434 width=8)
Hash Cond: ((this_.employer_id =
applicant_black_list.employer_id) AND (this_.applicant_id =
applicant_black_list.applicant_id))
-> Parallel Index Scan using
negotiation_topic_archive_test_employer_id_key on
negotiation_topic_archive_test this_ (cost=0.11..2044461.88
rows=112435 width=8)
Index Cond: (employer_id = 816144)
-> Parallel Hash (cost=10.13..10.13 rows=2 width=8)
-> Parallel Index Only Scan using
employer_to_user_index on applicant_black_list (cost=0.08..10.13
rows=2 width=8)
Index Cond: (employer_id = 816144)

So isolated tests on table without inheritance parents or some
legacy/forgotten custom settings - show the same results.

2.Now lets try dump/reload to the test host with Postgresql 16.4 on
stock pgdg default postgresql.conf:
pg_dump -Fc --compress=2
--table=user_tmp.negotiation_topic_archive_test
--table=applicant_black_list db > negotiation_topic_archive_test.dump

ls -la negotiation_topic_archive_test.dump
-rw-r--r-- 1 postgres postgres 4028520703 Aug 22 19:10
negotiation_topic_archive_test.dump

--on test host... and still no luck
pg_restore -Fc --port=5433 --dbname=test negotiation_topic_archive_test.dump

test=# analyze applicant_black_list;
ANALYZE
test=# analyze user_tmp.negotiation_topic_archive_test;
ANALYZE

set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size to 0;
set parallel_tuple_cost to 0;
set parallel_setup_cost to 0;
set debug_parallel_query to on;
explain
select * from user_tmp.negotiation_topic_archive_test this_
where
this_.employer_id=816144;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.57..1529820.39 rows=416420 width=8)
Workers Planned: 1
Single Copy: true
-> Index Scan using negotiation_topic_archive_test_employer_id_key
on negotiation_topic_archive_test this_ (cost=0.57..1529820.39
rows=416420 width=8)
Index Cond: (employer_id = 816144)

In the same time modified youre test case on the same 16.4 test db -
test case seems ok even with large dataset:
insert into t1 select x/10000 from generate_Series(1,100000000)X;
create index on t1(a);
analyze t1;
set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size to 0;
set parallel_tuple_cost to 0;
set parallel_setup_cost to 0;
set debug_parallel_query to on;
explain select * from t1 where a=10;
explain select * from t1 where a=10;
QUERY PLAN
---------------------------------------------------------------------------------------------
Gather (cost=0.57..4636.40 rows=10786 width=4)
Workers Planned: 2
-> Parallel Index Only Scan using t1_a_idx on t1
(cost=0.57..4636.40 rows=4494 width=4)
Index Cond: (a = 10)

--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andrei Lepikhov 2024-08-22 17:00:37 Re: BUG #18588: Cannot force/let database use parallel execution in simple case.
Previous Message Kuntal Ghosh 2024-08-22 15:14:22 Re: BUG #18559: Crash after detaching a partition concurrently from another session