| From: | Andres Freund <andres(at)anarazel(dot)de> | 
|---|---|
| To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> | 
| Cc: | David Christensen <david(dot)christensen(at)crunchydata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net> | 
| Subject: | Re: Initdb-time block size specification | 
| Date: | 2023-06-30 22:20:38 | 
| Message-ID: | 20230630222038.itet32chzwjf6gc6@awork3.anarazel.de | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hi,
On 2023-06-30 23:42:30 +0200, Tomas Vondra wrote:
> I wonder what are the conditions/options for disabling FPI. I kinda
> assume it'd apply to new drives with 4k sectors, with properly aligned
> partitions etc. But I haven't seen any particularly clear confirmation
> that's correct.
Yea, it's not trivial. And the downsides are also substantial from a
replication / crash recovery performance POV - even with reading blocks ahead
of WAL replay, it's hard to beat just sequentially reading nearly all the data
you're going to need.
> On 6/30/23 23:11, Andres Freund wrote:
> > If we really wanted to do this - but I don't think we do - I'd argue for
> > working on the buildsystem support to build the postgres binary multiple
> > times, for 4, 8, 16 kB BLCKSZ and having a wrapper postgres binary that just
> > exec's the relevant "real" binary based on the pg_control value.  I really
> > don't see us ever wanting to make BLCKSZ runtime configurable within one
> > postgres binary. There's just too much intrinsic overhead associated with
> > that.
>
> How would that work for extensions which may be built for a particular
> BLCKSZ value (like pageinspect)?
I think we'd need to do something similar for extensions, likely loading them
from a path that includes the "subvariant" the server currently is running. Or
alternatively adding a suffix to the filename indicating the
variant. Something like pageinspect.x86-64-v4-4kB.so.
The x86-64-v* stuff is something gcc and clang added a couple years ago, so
that not every project has to define different "baseline" levels. I think it
was done in collaboration with the sytem-v/linux AMD64 ABI specification group
([1]).
Greetings,
Andres
[1] https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/artifacts/master/raw/x86-64-ABI/abi.pdf?job=build
section 3.1.1.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tomas Vondra | 2023-06-30 22:21:03 | Re: Initdb-time block size specification | 
| Previous Message | Nikolay Samokhvalov | 2023-06-30 22:18:03 | Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption? |