RE: How to solve my slow disk i/o throughput during index scan

From: "FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT)" <simon(dot)freyburger(at)sncf(dot)fr>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: RE: How to solve my slow disk i/o throughput during index scan
Date: 2024-07-04 15:23:23
Message-ID: MR1P264MB34739538259E7FAF3A9DE4249DDE2@MR1P264MB3473.FRAP264.PROD.OUTLOOK.COM
Lists: pgsql-performance

Hello,

Thank you, splitting the query into “OR” clauses definitely enables bitmap heap scans, and thus parallelized reads from disk 😃! However, I did not understand your second point: what is a parallel append, and how do I enable it?

Simon F.


De : Andrei Lepikhov <lepihov(at)gmail(dot)com>
Envoyé : jeudi 4 juillet 2024 16:37
À : FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT) <simon(dot)freyburger(at)sncf(dot)fr>; pgsql-performance(at)lists(dot)postgresql(dot)org; Peter Geoghegan <pg(at)bowt(dot)ie>
Objet : Re: How to solve my slow disk i/o throughput during index scan


On 4/7/2024 20:25, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE
TGV / DM RMP YIELD MANAGEMENT) wrote:
> *My question is: what can I change to get a better index reading speed?*
>
> What I already tried:
>
> * Setting random_page_cost to a prohibitive value (10000000) to force a
>   bitmap heap scan, because those can be made in parallel. This has
>   not worked; the optimizer is still doing an index scan on my fact table.
> * Changing effective_io_concurrency, max_parallel_workers_per_gather, and
>   work_mem to much higher values.
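(For reference, the settings listed above can be tried per session before re-checking the plan. This is only an illustrative sketch; the values are placeholders, not recommendations, and none of them guarantees the planner will choose a bitmap heap scan:)

```sql
-- Session-level sketch of the settings mentioned above; values are
-- illustrative only and affect only the current session.
SET random_page_cost = 10000000;          -- discourage plain index scans
SET effective_io_concurrency = 256;       -- allow more concurrent prefetch requests
SET max_parallel_workers_per_gather = 8;  -- allow more parallel workers per Gather
SET work_mem = '256MB';                   -- more memory per sort/hash operation
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;    -- re-check which plan is chosen
```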

I'm not sure this case is only about the speed of the index scan. Just look at the slow index clause:

fth.\"ID_TRAIN\" = ANY ('{17855,13945,536795,18838,18837,13574 ...

and many more values.

An IndexScan needs to perform a scan for each of these values, and for each value go through the pages to check the other conditions.

We have already discussed some optimisations related to this case in a couple of pgsql-hackers threads, but I'm not sure we have a quick solution right now.

If you want to use a BitmapScan (which might be reasonable to try here), you need to split the huge ANY (...) clause into a sequence of ORs.
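As a sketch of that rewrite (the table alias and column come from the clause above; the table name, column list, and split points are placeholders, and each OR arm can then feed its own bitmap index scan that the planner may combine with a BitmapOr):

```sql
-- Instead of one huge list:
--   WHERE fth."ID_TRAIN" = ANY ('{17855,13945,536795,18838,18837,13574,...}')
-- split it into several OR'ed ANY clauses:
SELECT ...
FROM fact_table fth          -- placeholder name for the fact table
WHERE fth."ID_TRAIN" = ANY (ARRAY[17855, 13945, 536795])
   OR fth."ID_TRAIN" = ANY (ARRAY[18838, 18837, 13574]);
```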

Also, maybe a Parallel Append could help here? You can change the corresponding startup and tuple costs to force such a plan.
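(A hedged sketch of what that means: a Parallel Append lets workers scan the children of an Append node, e.g. the partitions of a partitioned table, concurrently, so it only applies if the fact table is partitioned or the plan otherwise contains an Append. The cost settings below are the usual knobs for making parallel plans look cheaper; zero values are illustrative extremes, not recommendations:)

```sql
-- Parallel Append is allowed by default (enable_parallel_append = on),
-- but the planner may still reject a parallel plan on cost grounds.
-- Lowering these costs makes parallel plans look cheaper to the planner.
SET enable_parallel_append = on;
SET parallel_setup_cost = 0;   -- default 1000
SET parallel_tuple_cost = 0;   -- default 0.1
```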

--
regards, Andrei Lepikhov

-------
This message and any attachments are intended solely for the addressees and are confidential. SNCF may not be held responsible for their contents whose accuracy and completeness cannot be guaranteed over the Internet. Unauthorized use, disclosure, distribution, copying, or any part thereof is strictly prohibited. If you are not the intended recipient of this message, please notify the sender immediately and delete it.
