Finalizing read stream users' flag choices

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Finalizing read stream users' flag choices
Date: 2025-04-08 16:06:47
Message-ID: CAAKRu_aopDxTo4b41Mt_7Zc-z0_ngocrY8SFCCY6Aph1HgwuNw@mail.gmail.com
Lists: pgsql-hackers

Hi,

Over the course of the last two releases, we have added many read
stream users. Each user specifies any number of flags (defined at the
top of read_stream.h) which govern different aspects of the read
stream behavior.

There are a few inconsistencies (many caused by me) that I want to
iron out and gain consensus on.

The first is whether maintenance_io_concurrency or
effective_io_concurrency affects the readahead distance.

We've said before that maintenance_io_concurrency should govern work
done on behalf of many different sessions. That was said to include at
least vacuum and recovery. I need to change the index vacuum users to
use READ_STREAM_MAINTENANCE. But I wonder about the other users like
amcheck and autoprewarm.

Another related question is if/how we should document which of these
are controlled by effective_io_concurrency or
maintenance_io_concurrency.
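
To make the first question concrete, the choice between the two GUCs can be
pictured as a single branch on the flag. This is only a sketch:
choose_io_concurrency() is a hypothetical helper, not a function in
read_stream.c, and the flag values are assumed to mirror read_stream.h.

```c
#include <assert.h>

/* Flag values assumed to mirror read_stream.h; verify against the header. */
#define READ_STREAM_DEFAULT      0x00
#define READ_STREAM_MAINTENANCE  0x01

/*
 * Hypothetical helper (not a PostgreSQL function): pick which GUC value
 * bounds the number of concurrent I/Os for a stream with the given flags.
 */
static int
choose_io_concurrency(int flags,
                      int effective_io_concurrency,
                      int maintenance_io_concurrency)
{
    assert(effective_io_concurrency >= 0);
    assert(maintenance_io_concurrency >= 0);

    /* Streams marked as maintenance work follow the maintenance GUC. */
    if (flags & READ_STREAM_MAINTENANCE)
        return maintenance_io_concurrency;

    /* Everything else follows effective_io_concurrency. */
    return effective_io_concurrency;
}
```

Under this model, a user that omits READ_STREAM_MAINTENANCE silently follows
effective_io_concurrency, which is why the flag choice matters for each user.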

The second is how read stream users ramp up the size of IOs and the
number of blocks read ahead:
READ_STREAM_DEFAULT ramps up the prefetch distance gradually.
READ_STREAM_FULL starts at full distance immediately.

Some of the users specify DEFAULT and others don't (it is defined as 0
so this is fine technically). Perhaps that should be explicit for all
of them? Separately, Thomas Munro has mentioned he thinks we should
remove READ_STREAM_FULL.
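
A toy model of the two ramp-up behaviors, and of why omitting DEFAULT is
harmless, might look like the following. Both helpers are hypothetical, the
doubling step is an assumption of the model rather than a description of the
heuristics in read_stream.c, and the flag values are assumed to mirror
read_stream.h.

```c
#include <assert.h>

/* Flag values assumed to mirror read_stream.h; verify against the header. */
#define READ_STREAM_DEFAULT       0x00
#define READ_STREAM_FULL          0x04
#define READ_STREAM_USE_BATCHING  0x08

/*
 * Hypothetical helper: starting look-ahead distance in this toy model.
 * FULL skips the ramp-up; anything else starts at a distance of 1.
 */
static int
initial_distance(int flags, int max_distance)
{
    assert(max_distance >= 1);
    return (flags & READ_STREAM_FULL) ? max_distance : 1;
}

/*
 * Hypothetical helper: next distance after a read that needed I/O.
 * The model doubles the distance and clamps it to max_distance.
 */
static int
ramp_up(int distance, int max_distance)
{
    int next = distance * 2;

    return (next > max_distance) ? max_distance : next;
}
```

Because READ_STREAM_DEFAULT is 0, `READ_STREAM_DEFAULT | READ_STREAM_USE_BATCHING`
and plain `READ_STREAM_USE_BATCHING` are the same value, so spelling out DEFAULT
only documents intent.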

A somewhat related point: with buffered IO, READ_STREAM_SEQUENTIAL
disables prefetching to encourage OS readahead. I don't know whether
any users other than sequential scan should do this.
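
The effect of READ_STREAM_SEQUENTIAL can be sketched as a predicate for
whether the stream issues prefetch advice. This is a simplified model:
prefetch_advice_enabled() and its parameters are hypothetical, and the
direct-IO and max_ios conditions are assumptions added for completeness,
not statements about the exact checks in read_stream.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag value assumed to mirror read_stream.h; verify against the header. */
#define READ_STREAM_SEQUENTIAL 0x02

/*
 * Hypothetical helper: should the stream issue explicit prefetch advice?
 * Assumptions of this model: advice only makes sense for buffered I/O,
 * and only when some I/O concurrency is allowed.
 */
static bool
prefetch_advice_enabled(int flags, bool direct_io, int max_ios)
{
    assert(max_ios >= 0);

    if (direct_io)
        return false;       /* no kernel page cache to advise */
    if (flags & READ_STREAM_SEQUENTIAL)
        return false;       /* leave it to OS readahead instead */
    return max_ios > 0;
}
```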

Other than the obvious issue with index vacuuming read stream users
needing to set READ_STREAM_MAINTENANCE, the other questions are
subjective. Below are all of the read stream users in master and their
current flags.

sequential scan: READ_STREAM_SEQUENTIAL | READ_STREAM_USE_BATCHING

bitmap heap scan: READ_STREAM_DEFAULT | READ_STREAM_USE_BATCHING

phase I heap vacuum: READ_STREAM_MAINTENANCE

phase II index vacuuming:
  btree index vacuuming: READ_STREAM_FULL | READ_STREAM_USE_BATCHING
  spgist vacuum: READ_STREAM_FULL | READ_STREAM_USE_BATCHING
  gist vacuum: READ_STREAM_FULL | READ_STREAM_USE_BATCHING

phase III heap vacuuming: READ_STREAM_MAINTENANCE | READ_STREAM_USE_BATCHING

analyze: READ_STREAM_MAINTENANCE | READ_STREAM_USE_BATCHING

amcheck:
  with skipping: READ_STREAM_DEFAULT
  without skipping: READ_STREAM_SEQUENTIAL | READ_STREAM_FULL | READ_STREAM_USE_BATCHING

autoprewarm: READ_STREAM_DEFAULT | READ_STREAM_USE_BATCHING

pg_prewarm: READ_STREAM_FULL | READ_STREAM_USE_BATCHING

pg_visibility:
  collect visibility data: READ_STREAM_FULL | READ_STREAM_USE_BATCHING
  collect corrupt items: READ_STREAM_FULL

createdb: READ_STREAM_FULL | READ_STREAM_USE_BATCHING
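
The vacuum entries in the list above can be checked mechanically for the
missing READ_STREAM_MAINTENANCE flag. The sketch below copies only the five
vacuum users and their flags from the list. The StreamUser struct and
count_missing_maintenance() are hypothetical, and the flag values are assumed
to mirror read_stream.h.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values assumed to mirror read_stream.h; verify against the header. */
#define READ_STREAM_MAINTENANCE   0x01
#define READ_STREAM_FULL          0x04
#define READ_STREAM_USE_BATCHING  0x08

/* Hypothetical record type for one read stream user. */
typedef struct
{
    const char *name;
    int         flags;
} StreamUser;

/* Vacuum-related users and flags, transcribed from the list above. */
static const StreamUser vacuum_users[] = {
    {"phase I heap vacuum", READ_STREAM_MAINTENANCE},
    {"btree index vacuuming", READ_STREAM_FULL | READ_STREAM_USE_BATCHING},
    {"spgist vacuum", READ_STREAM_FULL | READ_STREAM_USE_BATCHING},
    {"gist vacuum", READ_STREAM_FULL | READ_STREAM_USE_BATCHING},
    {"phase III heap vacuuming",
     READ_STREAM_MAINTENANCE | READ_STREAM_USE_BATCHING},
};

/* Hypothetical helper: count users whose flags lack READ_STREAM_MAINTENANCE. */
static size_t
count_missing_maintenance(const StreamUser *users, size_t n)
{
    size_t missing = 0;

    assert(users != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if ((users[i].flags & READ_STREAM_MAINTENANCE) == 0)
            missing++;
    }
    return missing;
}
```

Run over vacuum_users, the count covers exactly the three index vacuum
entries (btree, spgist, gist).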

- Melanie
