Re: Large block sizes support in Linux

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Pankaj Raghav (Samsung)" <kernel(at)pankajraghav(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, p(dot)raghav(at)samsung(dot)com, mcgrof(at)kernel(dot)org, gost(dot)dev(at)samsung(dot)com
Subject: Re: Large block sizes support in Linux
Date: 2024-03-22 18:46:05
Message-ID: Zf3R7YOcuHncq6cq@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 21, 2024 at 06:46:19PM +0100, Pankaj Raghav (Samsung) wrote:
> Hello,
>
> My team and I have been working on adding Large block size(LBS)
> support to XFS in Linux[1]. Once this feature lands upstream, we will be
> able to create XFS with FS block size > page size of the system on Linux.
> We also gave a talk about it in Linux Plumbers conference recently[2]
> for more context. The initial support is only for XFS but more FSs will
> follow later.
>
> On an x86_64 system, fs block size was limited to 4k, but traditionally
> Postgres uses 8k as its default internal page size. With LBS support,
> fs block size can be set to 8K, thereby matching the Postgres page size.
>
> If the file system block size == DB page size, then Postgres can have
> guarantees that a single DB page will be written as a single unit during
> kernel write back and not split.
>
> My knowledge of Postgres internals is limited, so I'm wondering if there
> are any optimizations or potential optimizations that Postgres could
> leverage once we have LBS support on Linux?

We have discussed this in the past, and in fact in the early years we
thought we didn't need fsync since the BSD file system was 8k at the
time.

What we later realized is that we have no guarantee that the file system
will write to the device in the specified block size, and even it it
does, the I/O layers between the OS and the device might not, since many
devices use 512 byte blocks or other sizes.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2024-03-22 18:59:44 Re: documentation structure
Previous Message Robert Haas 2024-03-22 18:40:12 Re: pg_upgrade --copy-file-range