Re: BitmapHeapScan streaming read user and prelim refactoring

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2025-02-14 17:14:20
Message-ID: fy3kx3gnrruys2fxmcxhxzmdxft2ryqgrwivqoowczz3d2dljw@tab26wmjfkgi
Lists: pgsql-hackers

Hi,

On 2025-02-14 10:04:41 +0100, Jakub Wartak wrote:
> Is there any reason we couldn't have new pg_test_iorates (similar to
> other pg_test_* proggies), that would literally do this and calibrate
> best e_io_c during initdb and put the result into postgresql.auto.conf
> (pg_test_iorates --adjust-auto-conf), that way we would avoid user
> questions on how to come up with an optimal value?

Unfortunately I think this is a lot easier said than done:

- The optimal depth depends a lot on your IO patterns; there are a lot of
things between fully sequential and fully random.

You'd really need to test different IO sizes and different patterns. The
test matrix for that gets pretty big.

- The performance characteristics of storage change heavily over time.

This is particularly true in cloud environments, where disk/VM combinations
will be "burstable", allowing higher throughput for a while, but then not
anymore. Measuring during either of those states will not be great for the
other state.

- e_io_c is per-query-node, but impacts the whole system. If you set
e_io_c=1000 on a disk with a metered IOPS of, say, 1k/s, you might get
slightly higher throughput for a bitmap heap scan, but your commit
latency in concurrent sessions will also go through the roof, because your
fdatasync() will be behind a queue of 1k reads that are all throttled.
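
To put a rough number on the last point, here is a back-of-the-envelope
sketch, not a measurement. It assumes a strictly FIFO device queue, a hard
IOPS cap, a single scan node keeping the queue full, and it ignores the
device's own service latency. The function name is made up for the
illustration. Under those assumptions, 1k queued reads on a 1k IOPS disk
put about one second in front of every flush.

```python
def flush_wait_seconds(queued_reads: int, iops_limit: float) -> float:
    """Estimate how long a flush waits behind already-queued reads.

    Assumes a FIFO queue drained at exactly `iops_limit` requests per
    second (a simplification; real devices and throttles are messier).
    """
    if iops_limit <= 0:
        raise ValueError("iops_limit must be positive")
    return queued_reads / iops_limit


if __name__ == "__main__":
    # Hypothetical queue depths, all against a disk metered at 1k IOPS.
    for depth in (1, 16, 128, 1000):
        wait_ms = flush_wait_seconds(depth, 1000.0) * 1000.0
        print(f"depth={depth:>4}: flush waits roughly {wait_ms:.0f} ms")
```

The estimate grows linearly with queue depth, which is why a value that
helps one scan a little can hurt every concurrent commit a lot.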

Greetings,

Andres Freund
