From: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
---|---|
To: | Alex Kaiser <alextkaiser(at)gmail(dot)com> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Getting an index scan to be a parallel index scan |
Date: | 2023-02-01 11:17:17 |
Message-ID: | CAEudQAqisJrxGsHzekOFHwXAscqb3K_AVrO9d8F9sL8QbQKVEA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Em qua., 1 de fev. de 2023 às 02:39, Alex Kaiser <alextkaiser(at)gmail(dot)com>
escreveu:
> Hello,
>
> I'm trying to get the following query to use a plan with parallelism, but
> I haven't been successful and would like some advice.
>
> The schema and table that I'm using is this:
>
> CREATE TABLE testing(
> id INT,
> info INT,
> data_one TEXT,
> data_two TEXT,
> primary key(id, info)
> );
>
> INSERT INTO testing(id, info, data_one, data_two)
> SELECT idx, idx, md5(random()::text), md5(random()::text)
> FROM generate_series(1,10000000) idx;
>
> Then the query that I'm trying to run is this (I'll include the full query
> at the very end of the email because it is long:
>
> select * from testing where id in (1608377,5449811, ... <1000 random ids>
> ,4654284,3558460);
>
> Essentially I have a list of 1000 ids and I would like the rows for all of
> those ids.
>
> This seems like it would be pretty easy to parallelize, if you have X
> threads then you would split the list of IDs into 1000/X sub lists and give
> one to each thread to go find the rows for ids in the given list. Even
> when I use the following configs I don't get a query plan that actually
> uses any parallelism:
>
> psql (15.1 (Debian 15.1-1.pgdg110+1))
> Type "help" for help.
>
> postgres=# show max_parallel_workers;
> max_parallel_workers
> ----------------------
> 8
> (1 row)
>
> postgres=# set max_parallel_workers_per_gather = 8;
> SET
> postgres=# set parallel_setup_cost = 0;
> SET
> postgres=# set parallel_tuple_cost = 0;
> SET
> postgres=# set force_parallel_mode = on;
> SET
> postgres=# explain select * from testing where id in (1608377,5449811, ...
> <removed for brevity> ... ,4654284,3558460);
>
Can you try:
select * from testing where id any = (values(1608377),(5449811),(5334677)
... <removed for brevity> ... ,(4654284),(3558460));
Or alternately you can use EXTEND STATISTICS to improve Postgres planner
choice.
regards,
Ranier Vilela
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2023-02-01 11:30:31 | Re: Getting an index scan to be a parallel index scan |
Previous Message | Alex Kaiser | 2023-02-01 05:39:06 | Getting an index scan to be a parallel index scan |