From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
---|---|
To: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] More docs on what to do and not do in extension code |
Date: | 2021-05-30 11:19:55 |
Message-ID: | 07de8fed255f31aa2dfda8efbe839df463f143c6.camel@cybertec.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, 2021-01-18 at 15:56 +0800, Craig Ringer wrote:
> The attached patch expands the xfunc docs and bgworker docs a little, providing a starting point for developers
> to learn how to do some common tasks the Postgres Way.
I like these changes!
Here is a review:
+ <para>
+ See <xref linkend="xfunc-shared-addin"/> for information on how to
+ request extension shared memory allocations, <literal>LWLock</literal>s,
+ etc.
+ </para>
This doesn't sound very English to me, and I don't see the point in
repeating parts of the enumeration. How about
See ... for detailed information how to allocate these resources.
+ <para>
+ If a background worker needs to sleep or wait for activity it should
Missing comma after "activity".
+ always use <link linkend="xfunc-sleeping-interrupts-blocking">PostgreSQL's
+ interupt-aware APIs</link> for the purpose. Do not <function>usleep()</function>,
+ <function>system()</function>, make blocking system calls, etc.
+ </para>
"system" is not a verb. Suggestion:
Do not use <function>usleep()</function>, <function>system()</function>
or other blocking system calls.
+ <para>
+ The <filename>src/test/modules/worker_spi</filename> and
+ <filename>src/test/modules/test_shm_mq</filename> contain working examples
+ that demonstrates some useful techniques.
</para>
That is missing a noun in my opinion, I would prefer:
The modules ... contain working examples ...
+ <sect1 id="bgworker-signals" xreflabel="Signal Handling in Background Workers">
+ <title>Signal Handling in Background Workers</title>
It is not a good idea to start a section in the middle of a documentation page.
That will lead to a weird TOC entry at the top of the page.
The better way to do this is to write a short introductory header and convert
most of the first half of the page into another section, so that we end up
with a page the has the introductory material and two TOC entries for the details.
+ The default signal handlers installed for background workers <emphasis>do
+ not interrupt queries or blocking calls into other postgres code</emphasis>
<productname>PostgreSQL</productname>, not "postgres".
Also, there is a comma missing at the end of the line.
+ so they are only suitable for simple background workers that frequently and
+ predictably return control to their main loop. If your worker uses the
+ default background worker signal handling it should call
Another missing comma after "handling".
+ <function>HandleMainLoopInterrupts()</function> on each pass through its
+ main loop.
+ </para>
+
+ <para>
+ Background workers that may make blocking calls into core PostgreSQL code
+ and/or run user-supplied queries should generally replace the default bgworker
Please stick with "background worker", "bgworker" is too sloppy IMO.
+ signal handlers with the handlers used for normal user backends. This will
+ ensure that the worker will respond in a timely manner to a termination
+ request, query cancel request, recovery conflict interrupt, deadlock detector
+ request, etc. To install these handlers, before unblocking interrupts
+ run:
The following would be more grammatical:
To install these handlers, run the following before unblocking interrupts:
+ Then ensure that your main loop and any other code that could run for some
+ time contains <function>CHECK_FOR_INTERRUPTS();</function> calls. A
+ background worker using these signal handlers must use <link
+ linkend="xfunc-resource-management">PostgreSQL's resource management APIs
+ and callbacks</link> to handle cleanup rather than relying on control
+ returning to the main loop because the signal handlers may call
There should be a comma before "because".
+ <function>proc_exit()</function> directly. This is recommended practice
+ for all types of extension code anyway.
+ </para>
+
+ <para>
+ See the comments in <filename>src/include/miscadmin.h</filename> in the
+ postgres headers for more details on signal handling.
+ </para>
"in the postgres headers" is redundant - at any rate, it should be "PostgreSQL".
+ Do not attempt to use C++ exceptions or Windows Structured Exception
+ Handling, and never call <function>exit()</function> directly.
I am alright with this addition, but I think it would be good to link to
<xref linkend="extend-cpp"/> from it.
+ <listitem id="xfunc-threading">
+ <para>
+ Individual PostgreSQL backends are single-threaded.
+ Never call any PostgreSQL function or access any PostgreSQL-managed data
+ structure from a thread other than the main
"PostgreSQL" should always have the <productname> tag.
This applies to a lot of places in this patch.
+ thread. If at all possible your extension should not start any threads
Comma after "possible".
+ and should only use the main thread. PostgreSQL generally uses subprocesses
Hm. If the extension does not start threads, it automatically uses the main thread.
I think that should be removed for clarity.
+ that coordinate over shared memory instead of threads - see
+ <xref linkend="bgworker"/>.
It also uses signals and light-weight locks - but I think that you don't need to
describe the coordination mechanisms here, which are explained in the link you added.
+ primitives like <function>WaitEventSetWait</function> where necessary. Any
+ potentially long-running loop should periodically call <function>
+ CHECK_FOR_INTERRUPTS()</function> to give PostgreSQL a chance to interrupt
+ the function in case of a shutdown request, query cancel, etc. This means
Are there other causes than shutdown or query cancellation?
At any rate, I am not fond of enumerations ending with "etc".
+ you should <function>sleep()</function> or <function>usleep()</function>
You mean: "you should *not* use sleep()"
+ for any nontrivial amount of time - use <function>WaitLatch()</function>
"—" would be better than "-".
+ or its variants instead. For details on interrupt handling see
Comma after "handling".
[...]
+ based cleanup. Your extension function could be terminated mid-execution
... could be terminated *in* mid-execution ...
+ by PostgreSQL if any function that it calls makes a
+ <function>CHECK_FOR_INTERRUPTS()</function> check. It may not
"makes" sound kind of clumsy in my ears.
+ Spinlocks, latches, condition variables, and more. Details on all of these
+ is far outside the scope of this document, and the best reference is
+ the relevant source code.
I don't think we have to add that last sentence. That holds for pretty much
everything in this documentation.
+ <para>
+ Sometimes relation-based state management for extensions is not
+ sufficient to meet their needs. In that case the extension may need to
Better:
Sometimes, relation-based state management is not sufficient to meet the
needs of an extension.
+ use PostgreSQL's shared-memory based inter-process communication
+ features, and should certainly do so instead of inventing its own or
+ trying to use platform level features. An extension may use
+ <link linkend="xfunc-shared-addin">"raw" shared memory requested from
+ the postmaster at startup</link> or higher level features like dynamic
+ shared memory segments (<acronym>DSM</acronym>),
+ dynamic shared areas (<acronym>DSA</acronym>),
+ or shared memory message queues (<acronym>shm_mq</acronym>). Examples
+ of the use of some these features can be found in the
+ <filename>src/test/modules/test_shm_mq/</filename> example extension. Others
+ can be found in various main PostgreSQL backend code.
+ </para>
Instead of the last sentence, I'd prefer
... or other parts of the source code.
+ <listitem id="xfunc-relcache-syscache">
+ <para>
+ Look up system catalogs and table information using the relcache and syscache
How about "table metadata" rather than "table information"?
+ APIs (<function>SearchSysCache...</function>,
+ <function>relation_open()</function>, etc) rather than attempting to run
+ SQL queries to fetch the information. Ensure that your function holds
+ any necessary locks on the target objects. Take care not to make any calls
... holds *the* necessary locks ...
+ that could trigger cache invalidations while still accessing any
+ syscache cache entries, as this can cause subtle bugs. See
Subtle? Perhaps "bugs that are hard to find".
+ <filename>src/backend/utils/cache/syscache.c</filename>,
+ <filename>src/backend/utils/cache/relcache.c</filename>,
+ <filename>src/backend/access/common/relation.c</filename> and their
+ headers for details.
+ </para>
+ </listitem>
Attached is a new version that has my suggested changes, plus a few from
Bharath Rupireddy (I do not agree with many of his suggestions).
Yours,
Laurenz Albe
Attachment | Content-Type | Size |
---|---|---|
v2-0002-Expand-docs-on-PostgreSQL-extension-coding.patch | text/x-patch | 15.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Asif Rehman | 2021-05-30 12:38:36 | Re: [PATCH] expand the units that pg_size_pretty supports on output |
Previous Message | Fabien COELHO | 2021-05-30 09:09:41 | psql - factor out echo code |