From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Pankaj Raghav <kernel(at)pankajraghav(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, mcgrof(at)kernel(dot)org, gost(dot)dev(at)samsung(dot)com |
Subject: | Re: Large block sizes support in Linux |
Date: | 2024-03-25 20:19:00 |
Message-ID: | ZgHcNGJVfE7-UkAG@momjian.us |
Lists: | pgsql-hackers |
On Mon, Mar 25, 2024 at 02:53:56PM +0100, Pankaj Raghav wrote:
> This is an excellent question that needs a bit of community discussion to
> expose a device agnostic value that userspace can trust.
>
> There might be a talk this year at LSFMM about untorn writes[1] in buffered IO
> path. I will make sure to bring this question up.
>
> At the moment, Linux exposes the physical blocksize by taking also atomic guarantees
> into the picture, especially for NVMe it uses the NAWUPF and AWUPF while setting
> physical blocksize (/sys/block/<dev>/queue/physical_block_size).
>
> A system admin could use the value exposed by phy_bs as a hint to set full_page_writes=off.
> Of course, this also requires the device to give atomic guarantees.
>
> The optimal setup would be DB page size == FS block size == Device atomic size.
One other thing I remember is that some people modified the ZFS file
system parameters enough that they made Postgres non-durable and
corrupted their database. This is a very hard thing to get right
because the user has very little feedback when they break things.
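
To make the check described in the quoted text concrete, here is a minimal sketch in Python that reads the sysfs value mentioned above and compares it with a database page size. The function names, the `sysfs_root` parameter, and the "page size <= physical block size" comparison are assumptions made for illustration; 8192 is PostgreSQL's default block size. A passing result is only a hint, not proof that the device, file system, and every layer in between actually provide untorn writes.

```python
from pathlib import Path

# PostgreSQL's default page size (BLCKSZ) in bytes.
PG_DEFAULT_BLOCK_SIZE = 8192


def read_physical_block_size(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Return the value of <sysfs_root>/<dev>/queue/physical_block_size in bytes.

    `sysfs_root` is a hypothetical parameter added so the function can be
    pointed at a fake directory tree for testing.
    """
    path = Path(sysfs_root) / dev / "queue" / "physical_block_size"
    return int(path.read_text().strip())


def page_fits_physical_block(page_size: int, physical_block_size: int) -> bool:
    """Illustrative heuristic (an assumption, not a guarantee): a database page
    no larger than the reported physical block size is a candidate for
    untorn writes."""
    return page_size <= physical_block_size
```

For example, `page_fits_physical_block(PG_DEFAULT_BLOCK_SIZE, read_physical_block_size("nvme0n1"))` would report whether the default page size fits the reported physical block size, where `nvme0n1` is a placeholder device name. Even when it returns true, turning off full_page_writes remains the administrator's risk.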
--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.