Re: Various small doc improvements; plpgsql, schemas, permissions, oidvector

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "Karl O(dot) Pinc" <kop(at)karlpinc(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Various small doc improvements; plpgsql, schemas, permissions, oidvector
Date: 2023-11-30 10:32:28
Message-ID: CAHv8RjLFDuhPf+x6Kfn=5KRXRLrmPWKR-3jdcdxrGS8-Vf1avQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 30, 2023 at 3:59 PM David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> Extending my prior email which is now redundant.
>
> On Tue, Oct 3, 2023 at 7:00 PM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>>
>> On Tue, Oct 3, 2023 at 4:15 PM Karl O. Pinc <kop(at)karlpinc(dot)com> wrote:
>>>
>>> On Tue, 3 Oct 2023 14:51:31 -0700
>>> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>>>
>>> Isn't the entire section about "deviating from the normal flow of the
>>> code"? That's what makes me want "Exception" in the section title.
>>
>>
>> It is about how error handling in a procedure diverts the flow from the normal code path to some other code path - what that path is labelled is less important than the thing that causes the diversion - an error.
>>
>>>
>>> ? I remain (overly?) focused on the word "exception", since that's
>>> whats in the brain of the user that's writing RAISE EXCEPTION.
>>> It matters if exceptions and errors are different. If they're
>>> not, then it also matters since it's exceptions that the user's
>>> code raises.
>>>
>>
>> It's unfortunate the keyword to raise the message level "ERROR" is "EXCEPTION" in that command but I'd rather simply handle that one anomaly that make the rest of the system use the word exception, especially seem to be fairly consistent in our usage of ERROR already. I'm sympathetic that other systems out there also encourage the usage of exception in this context instead of error - but not to the point of opening up this long-standing decision for rework.
>>
>>>
>>> Have you any thoughts on the "permissions", "privleges" and
>>> "attributes" vocabulary/concepts used in this area?
>>
>>
>> I think we benefit from being able to equate permissions and privileges and trying to separate them is going to be more harmful than helpful. The limited things that role attributes permit, and how they fall outside the privilege/permission concept as we use it, isn't something that I've noticed is a problem that needs addressing.
>>
>>
>>> (I'm slightly
>>> nervous about the renumbering making the thread hard to follow.)
>>>
>>
>> 0009 - Something just seems off with this one. Unless there are more places with this type in use I would just move the relevant notes (i.e., the one in proallargtypes) to that column and be done with it. If there are multiple places then moving the notes to the main docs and cross-referencing to them seems warranted. I also wouldn't call it legacy.
>>
>> 0010 -
>>
>> When creating new objects, if a schema qualification is not given with the name the first extant entry in the search_path is chosen; then an error will be raised should the supplied name already exist in that schema.
>> In contexts where the object must already exist, but its name is not schema qualified, the extant search_path schemas will be consulted serially until one of them contains an appropriate object, returning it, or all schemas are consulted, resulting in an object not found error.
>>
>> I'm not seeing much value in presenting the additional user/public details here. Especially as it would then seem appropriate to include pg_temp. And now we have to deal with the fact that by default the public schema isn't so public anymore.
>>
>
> 0011 - (first pass, going from memory, might have missed some needed details)
>
> Aside from non-atomic SQL routine bodies (functions and procedures) the result of the server executing SQL sent by the connected client does not result in raw SQL, or textual expressions, being stored for later evaluation. All objects are identified (or created) during execution and their effects stored within the system catalogs and assigned system identifiers (oids) to provide an absolute and immutable reference to be used while establishing inter-object dependencies. In short, indirect actions taken by the server, based upon stored knowledge, can and often will execute while in a search_path that only contains the pg_catalog schema so that the stored knowledge can be found.
>
> For routines written in any language except Atomic SQL the textual body of the routine is stored as-is within the database. When executing such a routine the (parent) session basically opens up a new connection to the server (one per routine) and within that new sub-session sends the SQL contained within the routine to the server for execution just like any other client, and therefore any object references present in that SQL need to be resolved to a schema as previously discussed. By default, upon connecting, the newly created session is updated so that its settings take on the current values in the parent session. When authoring a routine this is often undesirable as the behavior of the routine now depends upon an environment that is not definitively known to the routine author. Schema-qualifying object references within the routine body is one tool to remove such uncertainty. Another is by using the SET clause of the relevant CREATE SQL Command to specify what the value of important settings are to be.
>
> The key takeaway from the preceding two paragraphs is that because routines are stored as text and their settings resolved at execution time, and indirect server actions can invoke those routines with a pg_catalog only search_path, any routine that potentially can be invoked in that manner and makes use of search_path should either be modified to eliminate such use or define the required search_path via the SET option during its creation or replacement.
>
> 0012 - (this has changed recently too, I'm not sure how this fits within the rest. I still feel like something is missing even in my revision but not sure what or if it is covered sufficiently nearby)
>
> All roles are ultimately owned and managed by the bootstrap superuser, who can establish trees of groups and users upon which the object permission granting system works. By enabling the CREATEROLE attribute on a user a superuser can delegate role creation to other people (it is inadvisable to enable CREATEROLE on a group) who can then construct their own trees of groups and users.
>
> (not sure how true this is still but something to consider in terms of big picture role setups)
> It is likewise inadvisable to create multiple superusers since in practice their actions in many cases can be made to look attributable to the bootstrap superuser. It is necessary to enlist services outside of PostgreSQL to adequately establish auditing in a multi-superuser setup.
>
> Note my intentional use of users and groups here. We got rid of the distinction with CREATE ROLE but in terms of system administration they still have, IMO, significant utility.
>
> 0013 - +1
> 0014 - +1
>
> 0015 - I'd almost rather only note in CREATE FUNCTION that PARALLEL does not matter for a trigger returning function as triggers only execute in cases of data writing which precludes using parallelism. Which is indeed what the documentation explicitly calls out in "When Can Parallel Query Be Used?" so it isn't inference from omission.
>
> I don't have a problem saying in the trigger documentation, maybe at the very end:
>
> The functions that triggers execute are more appropriately considered procedures but since the later feature did not exist when triggers were implemented precedent compels the dba to write their routines as functions. As a consequence, function attributes such as PARALLEL, and WINDOW, are possible to define on a function that is to be used as a trigger but will have no effect. (though I would think at least some of these get rejected outright)
>
> 0016 - not within my knowledge base
>
>I reviewed the Patch and found a few changes. Please have a look at them:
-v7-0002-Change-section-heading-to-better-describe-referen.patch

"Re-Using the Type of Columns and Variables" seems adequate. Getting
something in there about declartions seems too wordy. I thought
perhaps "Referencing" instead of "Re-Using", but "referencing" isn't
perfect and "re-using" is generic enough, shorter, and simpler to

Here 'declartions' should be replaced with 'declarations'.

-v7-0012-Explain-role-management.patch

+ The managment of most database objects is by way of granting some role

Here 'managment' should be replaced with 'management'.

-v7-0013-Hyperlink-from-CREATE-FUNCTION-reference-page-to-.patch

Is is nice to have a link in the reference material to a full discussion.

Here 'is' should be removed.

-v7-0015-Trigger-authors-need-not-worry-about-parallelism.patch

Plus, this patch adds an index entry so the new verbage is easy to find
for those who do investigate.

Here 'verbage' should be replaced with 'verbiage'.

-v7-0016-Predicate-locks-are-held-per-cluster-not-per-data.patch

This is a significant corner case and so should be documented. It is
also somewhat suprising since the databases within a cluster are
otherwise isolated, at least from the user's perspective.

Here 'suprising' should be replaced with 'surprising'.

Predicate locks are held per-cluster, not per database.
+ This means that serializeable transactions in one database can have
+ effects in another.
+ Long running serializeable transactions, as might occur accidentally
+ when
+ <link linkend="app-psql-meta-command-pset-pager">pagination</link>
+ halts <link linkend="app-psql">psql</link> output, can have
+ significant inter-database effects.
+ These include exhausting available predicate locks and
+ cluster-wide <link linkend="ports12">WAL checkpoint delay</link>.
+ When making use of serializeable transactions consider having
+ separate clusters for production and non-production use.

Here 'serializeable' should be replaced with 'serializable'.

Thanks and Regards,
Shubham Khanna.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-11-30 10:56:08 Re: [Proposal] global sequence implemented by snowflake ID
Previous Message Gabriele Bartolini 2023-11-30 10:29:15 Extend pgbench partitioning to pgbench_history