Re: Vectored I/O in bulk_write.c

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Vectored I/O in bulk_write.c
Date: 2024-03-12 21:38:52
Message-ID: CA+hUKGJ32VBM+FAeizrkpsC0vAzRuX7o=seT5m=-quMyj=9zvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

One more observation while I'm thinking about bulk_write.c... hmm, it
writes the data out and asks the checkpointer to fsync it, but doesn't
call smgrwriteback(). I assume that means that on Linux the physical
writeback sometimes won't happen until the checkpointer eventually
calls fsync() sequentially, one segment file at a time. I see that
it's difficult to decide how to do that though; unlike checkpoints,
which have rate control/spreading, bulk writes could more easily flood
the I/O subsystem in a burst. I expect most non-Linux systems'
write-behind heuristics to fire up for bulk sequential writes, but on
Linux where most users live, there is no write-behind heuristic AFAIK
(on the most common file systems anyway), so you have to crank that
handle if you want it to wake up and smell the coffee before it hits
internal limits, but then you have to decide how fast to crank it.

This problem will come into closer focus when we start talking about
streaming writes. For the current non-streaming bulk_write.c coding,
I don't have any particular idea of what to do about that, so I'm just
noting the observation here.

Sorry for the sudden wall of text/monologue; this is all a sort of
reaction to bulk_write.c that I should perhaps have sent to the
bulk_write.c thread, triggered by a couple of debugging sessions.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-03-12 21:53:40 Recent 027_streaming_regress.pl hangs
Previous Message Tom Lane 2024-03-12 21:18:09 Re: On disable_cost