Re: Extensible storage manager API - SMGR hook Redux

From: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
To: Tristan Partin <tristan(at)neon(dot)tech>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Extensible storage manager API - SMGR hook Redux
Date: 2024-09-21 18:24:58
Message-ID: CAMm1aWakiMa1LK6NQN8BG-mamBhpTaoOOZwV3DhHyQ3ZtDeFAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I reviewed the discussion and took a look at the patch sets. It seems
like many things are combined here. Based on the subject, I initially
thought it aimed to provide the infrastructure to easily extend
storage managers. This would allow anyone to create their own storage
managers using this infrastructure. While it addresses this, it also
includes additional features like fsync_checker, which I believe
should be a separate feature. Even though it might use the same
infrastructure, it appears to be a different functionality. I think we
should focus solely on providing the infrastructure here.

We need to decide on our approach—whether to use a hook-based method
or a registration-based method—and I believe this requires further
discussion.

The hook-based approach is simple and works well for anyone writing
their own storage manager. However, it has its limitations as we can
either use the default storage manager or a custom-built one for all
the work load, but we cannot choose between multiple storage managers.
On the other hand, the registration-based approach allows choosing
between multiple storage managers based on the workload, though it
requires a lot of changes.

Are we planning to support other storage managers in PostgreSQL in the
near future? If not, it is better to go with the hook-based approach.
Otherwise, the registration-based approach is preferable as it offers
more flexibility to users and enhances PostgreSQL’s functionality.

Could you please share your thoughts on this? Also, let me know if
this topic has already been discussed and if any conclusions were
reached.

Best Regards,
Nitin Jadhav
Azure Database for PostgreSQL
Microsoft

On Sat, Jan 13, 2024 at 1:27 AM Tristan Partin <tristan(at)neon(dot)tech> wrote:
>
> Thought I would show off what is possible with this patchset.
>
> Heikki, a couple of months ago in our internal Slack, said:
>
> > [I would like] a debugging tool that checks that we're not missing any
> > fsyncs. I bumped into a few missing fsync bugs with unlogged tables
> > lately and a tool like that would've been very helpful.
>
> My task was to create such a tool, and off I went. I started with the
> storage manager extension patch that Matthias sent to the list last
> year[0].
>
> Andres, in another thread[1], said:
>
> > I've been thinking that we need a validation layer for fsyncs, it's too hard
> > to get right without testing, and crash testing is not likel enough to catch
> > problems quickly / resource intensive.
> >
> > My thought was that we could keep a shared hash table of all files created /
> > dirtied at the fd layer, with the filename as key and the value the current
> > LSN. We'd delete files from it when they're fsynced. At checkpoints we go
> > through the list and see if there's any files from before the redo that aren't
> > yet fsynced. All of this in assert builds only, of course.
>
> I took this idea and ran with it. I call it the fsync_checker™️. It is an
> extension that prints relations that haven't been fsynced prior to
> a CHECKPOINT. Note that this idea doesn't work in practice because
> relations might not be fsynced, but they might be WAL-logged, like in
> the case of createdb. See log_smgrcreate(). I can't think of an easy way
> to solve this problem looking at the codebase as it stands.
>
> Here is a description of the patches:
>
> 0001:
>
> This is essentially just the patch that Matthias posted earlier, but
> rebased and adjusted a little bit so storage managers can "inherit" from
> other storage managers.
>
> 0002:
>
> This is an extension of 0001, which allows for extensions to set
> a global storage manager. This is pretty hacky, and if it was going to
> be pulled into mainline, it would need some better protection. For
> instance, only one extension should be able to set the global storage
> manager. We wouldn't want extensions stepping over each other, etc.
>
> 0003:
>
> Adds a hook for extensions to inspect a checkpoint before it actually
> occurs. The purpose for the fsync_checker is so that I can iterate over
> all the relations the extension tracks to find files that haven't been
> synced prior to the completion of the checkpoint.
>
> 0004:
>
> This is the actual fsync_checker extension itself. It must be preloaded.
>
> Hopefully this is a good illustration of how the initial patch could be
> used, even though it isn't perfect.
>
> [0]: https://www.postgresql.org/message-id/CAEze2WgMySu2suO_TLvFyGY3URa4mAx22WeoEicnK%3DPCNWEMrA%40mail.gmail.com
> [1]: https://www.postgresql.org/message-id/20220127182838.ba3434dp2pe5vcia%40alap3.anarazel.de
>
> --
> Tristan Partin
> Neon (https://neon.tech)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-09-21 18:33:49 Re: scalability bottlenecks with (many) partitions (and more)
Previous Message Tom Lane 2024-09-21 18:22:49 Re: Docs pg_restore: Shouldn't there be a note about -n ?