From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Shaozhong SHI <shishaozhong(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Can Postgres beat Oracle for regexp_count? |
Date: | 2022-02-03 05:50:35 |
Message-ID: | CAKFQuwZ8PUnDWC645Oa1hhbT4LqRO2Kc3GuYyxNiuRU3OLhT1Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Wed, Feb 2, 2022 at 10:26 PM Shaozhong SHI <shishaozhong(at)gmail(dot)com>
wrote:
>
> select regexp_matches('My High Street', '([A-Z][a-z]+[\s]*)+', 'g')
> It is intended to match 'My High Street, but it turned out only 'Street'
> was matched.
>
>
I'm too tired to find the documentation for why you saw your result but
basically you only have a single capturing parentheses pair and since
you've quantified that you end up with just the last capture that was found
- Street. If you want to capture the entire found expression you need to
capture the quantifier. So put parentheses around the entire regexp.
select regexp_matches('My High Street', '(([A-Z][a-z]+[\s]*)+)', 'g')
You now have a two element array, slots filled left-to-right based upon the
opening parenthesis. So {"My High Street",Street}
To get rid of the undesired Street and only return a single element array
you need to make the inner parentheses non-capturing.
select regexp_matches('My High Street', '((?:[A-Z][a-z]+[\s]*)+)', 'g')
David J.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2022-02-03 05:59:36 | Re: Can Postgres beat Oracle for regexp_count? |
Previous Message | Shaozhong SHI | 2022-02-03 05:26:14 | Re: Can Postgres beat Oracle for regexp_count? |