RFC: Extension Packaging & Lookup

From: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RFC: Extension Packaging & Lookup
Date: 2024-10-10 20:34:08
Message-ID: 2CAD6FA7-DC25-48FC-80F2-8F203DECAE6A@justatheory.com
Lists: pgsql-hackers

Hackers,

Back at the end of August, I promised[1]:

> I’ll try to put some thought into a more formal proposal in a new thread next week. Unless your Gabriele beats me to it 😂.

I guess I should get off my butt and do it. So let’s do this. Here’s what I propose.

* When an extension is installed, all of its files should live in a single directory. These include:

* The control file that describes the extension
* Subdirectories for SQL, shared libraries, docs, binaries
(also locales and tsearch dictionaries?)

* Next, there should be an extension lookup path. The first item in the path is the compile-time default, and ideally would include only core extensions. Subsequent paths would be set by a GUC, similar to dynamic_library_path, but only for extensions (including their shared libraries).

* Modify PGXS (or create a new installer CLI used by PGXS?) to install an extension according to this pattern. Allow the specification of a prefix. This should differ from the current `PREFIX`, in that the values of `sharedir`, `pkglibdir`, etc. would not be fully duplicated under the prefix, but point to a directory used in the extension path. For example, when installing an extension named “pair”, something like

    make install BASE_DIR=/opt/pg/extension

would create `/opt/pg/extension/pair`, rather than `/opt/pg/extension/$(pg_config --sharedir)/extension/pair`.

* Perhaps there could also be an option to symlink binary files or man pages to keep paths simple.

* For CREATE EXTENSION, Postgres would look for an extension on the file system by directory name in each of the extension paths instead of control file name. It would then find the control file in that directory and the necessary SQL and shared library files in the `sql` and `lib` subdirectories of that directory.

* Binary-only extensions might also be installed here; the difference is they have no control file. The LOAD command and shared_preload_libraries would need to know to look here, too.
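To make the proposed CREATE EXTENSION lookup concrete, here is a minimal sketch of the directory-first search described above. It is only an illustration of the idea, not a proposed implementation: the function names `find_extension` and `sql_scripts` are hypothetical, and the search path is passed in as a plain list because the GUC that would hold it does not exist yet.

```python
from pathlib import Path
from typing import List, Optional, Sequence


def find_extension(name: str, search_path: Sequence[str]) -> Optional[Path]:
    """Return the first directory on search_path that holds the named extension.

    A directory qualifies when it is called `name` and contains `name.control`.
    Earlier entries win, so the compile-time default would come first.
    (Hypothetical helper; the real lookup would live in the server.)
    """
    for base in search_path:
        ext_dir = Path(base) / name
        if (ext_dir / f"{name}.control").is_file():
            return ext_dir
    return None


def sql_scripts(ext_dir: Path) -> List[str]:
    """List the script file names in the extension's `sql` subdirectory."""
    return sorted(p.name for p in (ext_dir / "sql").glob("*.sql"))
```

With a path such as `["/usr/share/postgresql/extension", "/opt/pgxn"]` (both values are placeholders), `find_extension("pair", path)` would return `/opt/pgxn/pair` if that directory contains `pair.control`, and `None` if no entry on the path does.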

The basic idea, then, is three-fold:

1. This is a general packaging pattern rather than something specific to CREATE EXTENSION, since it also covers other types of extensions.

2. All the files for a given extension live within a single directory, making it easier to reason about what’s installed and what’s not.

3. These extension packages can live in multiple paths.

Some examples. Core extensions, like citext, would live in, say, `$(pg_config --extensiondir)/citext`, and have a structure such as:

```
citext
├── citext.control
├── lib
│   ├── citext.dylib
│   └── bitcode
│       ├── citext
│       │   └── citext.bc
│       └── citext.index.bc
└── sql
    ├── citext--1.0--1.1.sql
    ├── citext--1.1--1.2.sql
    ├── citext--1.2--1.3.sql
    ├── citext--1.3--1.4.sql
    ├── citext--1.4--1.5.sql
    ├── citext--1.4.sql
    └── citext--1.5--1.6.sql
```

Third-party extensions would live in one or more other directories on the file system, unknown at compile time, but set in the extension path GUC and accessible to/owned by the Postgres system user. Let’s say we set `/opt/pgxn` as one of the paths. Within that directory, we might have a pure SQL extension in a directory named “pair” that looks like this:

```
pair
├── LICENSE.md
├── README.md
├── pair.control
├── doc
│   ├── html
│   │   └── pair.html
│   └── pair.md
└── sql
    ├── pair--1.0--1.1.sql
    └── pair--1.1.sql
```

A binary application like pg_top would live in the pg_top directory, structured something like:

```
pg_top
├── HISTORY.rst
├── INSTALL.rst
├── LICENSE
├── README.rst
├── bin
│   └── pg_top
└── doc
    └── man
        └── man3
            └── pg_top.3
```

And a C extension like semver would live in the semver directory and be structured something like:

```
semver
├── LICENSE
├── README.md
├── semver.control
├── doc
│   └── semver.md
├── lib
│   ├── semver.dylib
│   └── bitcode
│       ├── semver
│       │   └── semver.bc
│       └── semver.index.bc
└── sql
    ├── semver--1.0--1.1.sql
    └── semver--1.1.sql
```
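The layouts above suggest that the kind of package can be read off the directory itself: a control file means CREATE EXTENSION can use it, a `lib` directory without a control file means a binary-only module for LOAD or shared_preload_libraries, and a `bin` directory alone means a standalone application. The following is a rough sketch of that reading; the function name `classify_package` and the returned labels are made up for illustration and are not part of the proposal.

```python
from pathlib import Path


def classify_package(pkg_dir: Path) -> str:
    """Guess what kind of package a directory holds from its layout.

    The labels returned here are hypothetical names chosen for this sketch.
    """
    name = pkg_dir.name
    if (pkg_dir / f"{name}.control").is_file():
        # Has a control file: usable with CREATE EXTENSION (citext, pair, semver).
        return "extension"
    if (pkg_dir / "lib").is_dir():
        # Shared library but no control file: a binary-only module.
        return "loadable-module"
    if (pkg_dir / "bin").is_dir():
        # Only executables: an application such as pg_top.
        return "application"
    return "unknown"
```

The checks run in that order on purpose, so a C extension that ships both a control file and a `lib` directory is still reported as an extension.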

Thoughts?

Best,

David

[1]: https://www.postgresql.org/message-id/D30A91FA-A6D4-4737-941F-0BBB2984B730%40justatheory.com
