Re: add function argument name to substring and substr

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: add function argument name to substring and substr
Date: 2025-03-19 00:18:28
Message-ID: CAKFQuwZ=ZEkQaFZBRpukNn-qvk4U52ZAsrLLRPAC4hgcZsJmkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 17, 2025 at 8:20 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:

> On Tue, Feb 18, 2025 at 6:13 AM David G. Johnston
> <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> >
> > Table 9.9 limits itself to those functions defined in the SQL standard;
> which are basically the ones that use keywords instead of commas.
> >
> > The substring(string, start, count) function you note is already covered
> in Table 9.10 but we spell it substr(...)
> >
> > I don't think adding yet more spellings of this same function is
> warranted or desirable at this point.
>
> ok.
>
> > I'd maybe add a note if substring(,,,) works to substr saying that
> substring is a valid alias. I could be convinced to just document though.
> >
> it seems already in the doc.
>
> substr ( string text, start integer [, count integer ] ) → text
> Extracts the substring of string starting at the start'th character,
> and extending for count characters if that is specified. (Same as
> substring(string from start for count).)
>
> substr ( bytes bytea, start integer [, count integer ] ) → bytea
> Extracts the substring of bytes starting at the start'th byte, and
> extending for count bytes if that is specified. (Same as
> substring(bytes from start for count).)
>
>
> new patch attached.
> main changes:
> 1. change 3 argument func argument from
> (string text, pattern text, escape_character text)
> to
> (string text, pattern text, escape text)
>

Why? It can only be one character so that existing name seems well chosen.

postgres=# select substring('123^^,123',',','^^');
ERROR: invalid escape string
HINT: Escape string must be empty or one character.
CONTEXT: SQL function "substring" statement 1

> 2. add synopsis section in 9.7.3. POSIX Regular Expressions for
> function substring.
> we only have the synopsis section for function substring in 9.7.2
> section, now add it to 9.7.3.
>

I'd probably try and resolve that the other way...point the reader to the
reference page for the function if they want to see syntax. I'm mixed on
(leaning against) whether choosing this place to demonstrate all the
possible spellings is the best.

also add an example about using named natation call substring:
> substring(string=>'foobar', pattern=> 'o.b')
>

I'm already pulling my hair out at this showing all the insanity that
exists without adding this to the mix.

The vast majority of examples throughout the manual use traditional
function call syntax func_name(arg1, arg2, etc.); I'd rather keep with
convention than start to scatter about alternative syntax choices just to
give the random reader who happens upon this fairly esoteric part of the
manual the benefit of seeing their options. If that is a goal, then I'd
suggest spending some time in our Tutorial adding some more examples with
these alternative forms to people looking to be exposed to new things in
the place they'd go to look for them. They probably won't learn about them
from the Syntax section.

On the plus side, I agree now we should add:
substring(string text, pattern text[, escape-character text])
to Table 9.10

I'd also rename escape to escape-character in the other SQL substring
function synopses. The RegEx page got that part correct.

Do as little or as much with the RegEx section as you'd like, though it
seems like separate material from $subject. The page seems to already use
replaceable names instead of data types so on that score it should be
unaffected if we've chosen the same names.

Food for thought, it seems a bit redundant to name the first argument
basically the same as the data type. I was thinking that "content" would
be a better choice. This is basically a polymorphic function where all the
inputs are the same thing just having different types - and that thing is
being "content".

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2025-03-19 00:41:17 Re: pgsql: aio: Infrastructure for io_method=worker
Previous Message Michael Paquier 2025-03-18 23:56:54 Re: Add Pipelining support in psql