Re: Re: Re: Allow Reg Expressions in Position function

From: Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com>
To: postgresql(at)walla(dot)co(dot)il
Cc: pgsqladmin(at)geoff(dot)dj, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Re: Re: Allow Reg Expressions in Position function
Date: 2018-08-20 16:07:35
Message-ID: CAD3a31UT7YO1oWNi8EUnYgxUKP-L4KXoeGemY0qQSpzoPSsqOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>
>
> Gives 5. It's wrong.
>
>
> True. Though your SO example didn't have the https in it.

> For some reason, substring() returns the parenthesised subexpression
> rather than the top level..
>
> The comment in testregexsubstr does say that it does this, but it's not
> clear from the documentation at all, unless I'm missing where it says it.
>
> The description of substring (
https://www.postgresql.org/docs/9.6/static/functions-string.html) says "See
Section 9.7 for more information on pattern matching." Section 9.7.3
("POSIX Regular Expressions") says this:

The substring function with two parameters, substring(string from pattern),
provides extraction of a substring that matches a POSIX regular expression
pattern. It returns null if there is no match, otherwise the portion of the
text that matched the pattern. But* if the pattern contains any
parentheses, the portion of the text that matched the first parenthesized
subexpression* (the one whose left parenthesis comes first) is returned.
You can put parentheses around the whole expression if you want to use
parentheses within it without triggering this exception.
(https://www.postgresql.org/docs/9.6/static/functions-matching.html)

> You can work around this by putting parentheses around the whole
> expression, because that way the first subexpression is the whole match.
>
> db=# SELECT position(substring('https://www.webexample.com/s/help?' FROM
> '(/(s|b|t)/)') IN 'https://www.webexample.com/s/help?');
> position
> ----------
> 27
>
> Geoff
>
> Thanks for the correction!

A less-fragile way to do this is to comparing length of whole string to
length of matched string. See below.

> For the greater good I sent the email requesting to allow reg exp in the
> position functions.
> Not sure if you will implement it... Just wanted to let you know that the
> limited capabilities of this function create overhead.

Not clear to me if you're talking CPU overhead or syntax complexity. But
if you really want this function for yourself, you can have it.

CREATE OR REPLACE FUNCTION position_regex (text, text) RETURNS INT AS $$

SELECT COALESCE(LENGTH($2)-LENGTH(SUBSTRING($2,'('||$1||'.*)$'))+1,0);

$$ LANGUAGE SQL IMMUTABLE;

Cheers,

Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client>*
ken(dot)tanzer(at)agency-software(dot)org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request(at)lists(dot)sourceforge(dot)net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Vijaykumar Jain 2018-08-20 16:47:07 Re: [External] Multiple COPY on the same table
Previous Message Nicolas Paris 2018-08-20 15:56:01 Re: Multiple COPY on the same table