Re: BitmapHeapScan streaming read user and prelim refactoring

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2025-03-12 00:07:38
Message-ID: ihb3ggccagv34lpm4ninuq5gkvhi5fuikdzr6677spevp7fxlu@bj6ryi4jkjf4
Lists: pgsql-hackers

Hi,

On 2025-03-10 19:45:38 -0400, Melanie Plageman wrote:
> From 7b35b1144bddf202fb4d56a9b783751a0945ba0e Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
> Date: Mon, 10 Mar 2025 17:17:38 -0400
> Subject: [PATCH v35 1/5] Increase default effective_io_concurrency to 16
>
> The default effective_io_concurrency has been 1 since it was introduced
> in b7b8f0b6096d2ab6e. Referencing the associated discussion [1], it
> seems 1 was chosen as a conservative value that seemed unlikely to cause
> regressions.

16 years...

> Experimentation on high latency cloud storage as well as fast, local
> nvme storage (see Discussion link) shows that even slightly higher values
> improve query timings substantially. 1 actually performs worse than 0.
> With effective_io_concurrency 1, we are not prefetching enough to avoid
> I/O stalls, but we are issuing extra syscalls.

Makes sense.

> Moreover, when bitmap heap scan is converted to using the read stream
> API, a prefetch distance of 1 will prevent read combining which is quite
> detrimental to performance.

Hm? This one surprises me. Doesn't the read stream code take some pains to
still perform IO combining when effective_io_concurrency=1? It does work for
seqscans, for example?

> The new default is 16, which should be more appropriate in the general
> case while still avoiding flooding low IOPs devices with I/O requests.

Maybe s/in the general case/for common hardware/?

> [1] https://www.postgresql.org/message-id/flat/FDDBA24E-FF4D-4654-BA75-692B3BA71B97%40enterprisedb.com
>
> Discussion: https://postgr.es/m/CAAKRu_Z%2BJa-mwXebOoOERMMUMvJeRhzTjad4dSThxG0JLXESxw%40mail.gmail.com
> ---
> doc/src/sgml/config.sgml | 38 +++++++++----------
> src/backend/utils/misc/postgresql.conf.sample | 2 +-
> src/include/storage/bufmgr.h | 2 +-
> 3 files changed, 19 insertions(+), 23 deletions(-)
>
> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index d2fa5f7d1a9..8c4409fc8bf 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -2577,36 +2577,32 @@ include_dir 'conf.d'
> Sets the number of concurrent disk I/O operations that
> <productname>PostgreSQL</productname> expects can be executed
> simultaneously. Raising this value will increase the number of I/O
> - operations that any individual <productname>PostgreSQL</productname> session
> - attempts to initiate in parallel. The allowed range is 1 to 1000,
> - or zero to disable issuance of asynchronous I/O requests. Currently,
> - this setting only affects bitmap heap scans.
> + operations that any individual <productname>PostgreSQL</productname>
> + session attempts to initiate in parallel. The allowed range is
> + <literal>1</literal> to <literal>1000</literal>, or
> + <literal>0</literal> to disable issuance of asynchronous I/O requests.
> + The default is <literal>16</literal> on supported systems, otherwise
> + <literal>0</literal>. Currently, this setting only affects bitmap heap
> + scans.
> </para>

I'd probably use this as an occasion to remove the "Currently, this setting only
affects bitmap heap scans" sentence - afaict it's been wrong for a while and got
more wrong since vacuum started to use read streams...

> <para>
> - For magnetic drives, a good starting point for this setting is the
> - number of separate
> - drives comprising a RAID 0 stripe or RAID 1 mirror being used for the
> - database. (For RAID 5 the parity drive should not be counted.)
> - However, if the database is often busy with multiple queries issued in
> - concurrent sessions, lower values may be sufficient to keep the disk
> - array busy. A value higher than needed to keep the disks busy will
> - only result in extra CPU overhead.
> - SSDs and other memory-based storage can often process many
> - concurrent requests, so the best value might be in the hundreds.

Afaict this whole paragraph was *never* correct... Obviously that's not
criticism of your removing it ;)

> + Higher values will have the most impact on higher latency storage
> + where queries otherwise experience noticeable I/O stalls and on
> + devices with high IOPs. Higher values than needed to satisfy the query
> + or keep the device busy can be expected to only introduce extra CPU
> + overhead.
> </para>

I'd say unnecessarily high values also can increase IO latency.
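
For anyone wanting to try different values while judging the doc wording, a
minimal sketch follows. The tablespace name fast_nvme and the specific numbers
are made up for illustration, not recommendations:

```sql
-- Show the value currently in effect for this session.
SHOW effective_io_concurrency;

-- Try a different value for the current session only.
SET effective_io_concurrency = 16;

-- Override the setting for a single tablespace
-- ("fast_nvme" is a hypothetical tablespace name).
ALTER TABLESPACE fast_nvme SET (effective_io_concurrency = 64);

-- Disable issuance of asynchronous I/O requests, for comparison.
SET effective_io_concurrency = 0;
```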

Greetings,

Andres Freund
