Re: Extensible storage manager API - SMGR hook Redux

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Tristan Partin <tristan(at)neon(dot)tech>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, zsolt(dot)parragi(at)cancellar(dot)hu, nitinjadhavpostgres(at)gmail(dot)com, gongxun0928(at)gmail(dot)com
Subject: Re: Extensible storage manager API - SMGR hook Redux
Date: 2025-02-03 11:27:23
Message-ID: cd68e15f-35b3-4dea-bd62-cf88eeaa0fbe@proxel.se
Lists: pgsql-hackers

Hi!

We at Percona are very interested in this patch for our transparent data
encryption extension. So we would love to collaborate with you, and
anyone else interested, on making the SMGR extensible.

I have attached rebased and slightly cleaned-up versions of Tristan's
patches, plus a couple of patches we have been working on in-house
(mainly my colleague Zsolt). I also have some questions which I would
like to discuss.

0001-0004

The same patches as Tristan posted, but rebased and cleaned up a bit to
better follow the code style. I also removed a couple of dead variables
which seemed like leftovers.

0005

Since we support having both encrypted and unencrypted relations, we use
the RelFileLocator to look up whether a relation is encrypted. To
preserve that information when smgrcreate() creates a new relfile for a
relation, we pass along the old RelFileLocator.

For our use case it is possible that we could solve this in other ways.
For example, if we decide to go with configuring the SMGR per schema,
this will probably not be necessary at all.
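
To illustrate why the old RelFileLocator is useful at creation time,
here is a minimal standalone sketch. It is not the code in 0005: the
SketchLocator type, the lookup table and all sketch_* functions are
hypothetical stand-ins, and the real patch works on the actual
smgrcreate()/mdcreate() signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for RelFileLocator. */
typedef struct SketchLocator
{
    unsigned spc;
    unsigned db;
    unsigned rel;
} SketchLocator;

#define SKETCH_MAX_RELS 8

/* Hypothetical per-relfile map recording whether a relfile is encrypted. */
static struct
{
    SketchLocator loc;
    bool          encrypted;
    bool          used;
} sketch_map[SKETCH_MAX_RELS];

static bool
sketch_same(const SketchLocator *a, const SketchLocator *b)
{
    return a->spc == b->spc && a->db == b->db && a->rel == b->rel;
}

/* Look up the encryption status; unknown relfiles count as unencrypted. */
static bool
sketch_is_encrypted(const SketchLocator *loc)
{
    for (int i = 0; i < SKETCH_MAX_RELS; i++)
    {
        if (sketch_map[i].used && sketch_same(&sketch_map[i].loc, loc))
            return sketch_map[i].encrypted;
    }
    return false;
}

/* Record the encryption status of a relfile in the first free slot. */
static void
sketch_register(const SketchLocator *loc, bool encrypted)
{
    for (int i = 0; i < SKETCH_MAX_RELS; i++)
    {
        if (!sketch_map[i].used)
        {
            sketch_map[i].loc = *loc;
            sketch_map[i].encrypted = encrypted;
            sketch_map[i].used = true;
            return;
        }
    }
    assert(!"sketch_map is full");
}

/*
 * Create a new relfile. If the relation already had a relfile, the new
 * one inherits its encryption status; old_loc is NULL for a brand new
 * relation.
 */
static void
sketch_create(const SketchLocator *old_loc, const SketchLocator *new_loc)
{
    bool encrypted = (old_loc != NULL) && sketch_is_encrypted(old_loc);

    sketch_register(new_loc, encrypted);
}
```

Without the old locator, the create call would have no way of knowing
that the new relfile replaces an encrypted one.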

0006

The patch introduces the concept of "chaining" SMGRs, where we have a
tail (e.g. md or a theoretical Ceph SMGR) and modifiers (e.g. TDE or the
fsync_checker). Something like this would be useful for our case, since
it would be nice to be able to use the same encryption code for md and
for some other potential replacement for md which uses, for example,
some kind of object storage.

As a bonus this allowed us to make the functions implementing md static.

It is currently controlled via a GUC, smgr_chain, but this will of
course depend on how we decide to implement configuring which SMGR to use.
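
As a rough illustration of the chaining idea, here is a minimal
standalone sketch. It does not reflect the actual structures in 0006:
the SketchSmgr struct, the "next" pointer, the tiny block size and the
XOR "encryption" are all assumptions made up for this example. The
modifier transforms the block and delegates to the next SMGR in the
chain, and the tail does the actual storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLCKSZ  16   /* tiny block size, for illustration only */
#define SKETCH_NBLOCKS 4

/* Hypothetical, simplified stand-in for an SMGR function table. */
typedef struct SketchSmgr
{
    const char *name;
    const struct SketchSmgr *next;  /* next SMGR towards the tail, NULL for a tail */
    void (*write_block) (const struct SketchSmgr *self, int blkno,
                         const unsigned char *buf);
    void (*read_block) (const struct SketchSmgr *self, int blkno,
                        unsigned char *buf);
} SketchSmgr;

/* In-memory "disk" used by the tail. */
static unsigned char sketch_storage[SKETCH_NBLOCKS][SKETCH_BLCKSZ];

/* Tail: stores blocks as given, playing the role md would play. */
static void
tail_write(const SketchSmgr *self, int blkno, const unsigned char *buf)
{
    (void) self;
    assert(blkno >= 0 && blkno < SKETCH_NBLOCKS);
    memcpy(sketch_storage[blkno], buf, SKETCH_BLCKSZ);
}

static void
tail_read(const SketchSmgr *self, int blkno, unsigned char *buf)
{
    (void) self;
    assert(blkno >= 0 && blkno < SKETCH_NBLOCKS);
    memcpy(buf, sketch_storage[blkno], SKETCH_BLCKSZ);
}

/* Modifier: XOR is a placeholder for real encryption, nothing more. */
static void
xor_write(const SketchSmgr *self, int blkno, const unsigned char *buf)
{
    unsigned char tmp[SKETCH_BLCKSZ];

    assert(self->next != NULL);
    for (int i = 0; i < SKETCH_BLCKSZ; i++)
        tmp[i] = buf[i] ^ 0x5A;
    self->next->write_block(self->next, blkno, tmp);
}

static void
xor_read(const SketchSmgr *self, int blkno, unsigned char *buf)
{
    assert(self->next != NULL);
    self->next->read_block(self->next, blkno, buf);
    for (int i = 0; i < SKETCH_BLCKSZ; i++)
        buf[i] ^= 0x5A;
}

/* The chain: modifier -> tail. Callers only talk to the head. */
static const SketchSmgr sketch_tail = {"tail", NULL, tail_write, tail_read};
static const SketchSmgr sketch_modifier = {"xor", &sketch_tail, xor_write, xor_read};
```

Since the modifier only knows about "next", the same modifier could sit
in front of a different tail without any changes, which is the property
we are after.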

Questions

- What is up with the barrier when loading SMGRs? That does not seem
necessary, or am I missing something? I believe Andres also spotted this.

- How should we configure which SMGR to use for each relation? People
have talked about doing it per tablespace or using hooks and we have a
patch which uses a GUC for this. I have personally not researched these
options enough to have an opinion yet.

- Is our idea about chaining SMGRs useful? In its current form or some
variant inspired by it?

- We need to benchmark this to make sure we do not introduce too much
overhead, especially for people who just want to use md. I saw, for
example, that Andres had some complaints about the extra indirection,
which we may have to address.

Andreas

Attachment Content-Type Size
v3-0001-Expose-f_smgr-to-extensions-for-manual-implementa.patch text/x-patch 33.7 KB
v3-0002-Allow-extensions-to-override-the-global-storage-m.patch text/x-patch 2.0 KB
v3-0003-Add-checkpoint_create_hook.patch text/x-patch 1.9 KB
v3-0004-Add-contrib-fsync_checker.patch text/x-patch 9.9 KB
v3-0005-Refactor-smgr-API-mdcreate-needs-the-old-relfilel.patch text/x-patch 13.3 KB
v3-0006-SMGR-GUC-variable-and-chaining.patch text/x-patch 47.1 KB
