From: | Shaozhong SHI <shishaozhong(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Detecting repeated phrase in a string |
Date: | 2021-12-09 14:46:05 |
Message-ID: | CA+i5JwaEwK=ktV-H-xS2dHgGfWL0RPRDVhcghJ5rQM45DqLY-g@mail.gmail.com |
Lists: | pgsql-general |
Hi, Peter,
How can I define a word boundary as the start of the string (^), a space,
or the end of the string ($),
so that the following holds?
"fox fox" is a repeat.
"foxfox" is not a repeat; it is just one word.
Regards,
David
On Thu, 9 Dec 2021 at 13:35, Peter J. Holzer <hjp-pgsql(at)hjp(dot)at> wrote:
> On 2021-12-09 12:38:15 +0000, Shaozhong SHI wrote:
> > Does anyone know how to detect repeated phrase in a string?
>
> Use regular expressions with backreferences:
>
> bayes=> select regexp_match('foo wikiwiki bar', '(.+)\1');
> ╔══════════════╗
> ║ regexp_match ║
> ╟──────────────╢
> ║ {o} ║
> ╚══════════════╝
> (1 row)
>
> "o" is repeated in "foo".
>
> bayes=> select regexp_match('fo wikiwiki bar', '(.+)\1');
> ╔══════════════╗
> ║ regexp_match ║
> ╟──────────────╢
> ║ {wiki} ║
> ╚══════════════╝
> (1 row)
>
> "wiki" is repeated in "wikiwiki".
>
> bayes=> select regexp_match('fo wikiwi bar', '(.+)\1');
> ╔══════════════╗
> ║ regexp_match ║
> ╟──────────────╢
> ║ (∅) ║
> ╚══════════════╝
> (1 row)
>
> nothing is repeated.
>
> Adjust the expression within parentheses if you want to match something
> more specific than any sequence of one or more characters.
>
> hp
>
> --
> _ | Peter J. Holzer | Story must make more sense than reality.
> |_|_) | |
> | | | hjp(at)hjp(dot)at | -- Charles Stross, "Creative writing
> __/ | http://www.hjp.at/ | challenge!"
>
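One way to express the boundary asked about at the top of this message (start of string, whitespace, or end of string) is to wrap the backreference pattern in non-capturing groups. This is only a sketch and not something taken from the thread: the sample strings are made up, and it assumes whitespace is the only separator that should count.

```sql
-- Sketch: a word counts as repeated only when both copies are delimited by
-- start of string, whitespace, or end of string.
-- (?:...) groups are non-capturing, so \1 still refers to the word itself.
SELECT s,
       regexp_match(s, '(?:^|\s)(\S+)\s+\1(?:\s|$)') AS repeated
FROM (VALUES
        ('fox fox'),            -- repeat: expect {fox}
        ('foxfox'),             -- one word: expect NULL
        ('the fox fox jumps'),  -- repeat in the middle: expect {fox}
        ('fox foxes')           -- second word only starts with "fox": expect NULL
     ) AS t(s);
```

PostgreSQL's regular expressions also provide the constraint escapes \m (beginning of a word) and \M (end of a word), so '\m(\w+)\s+\1\M' may be a shorter alternative. That variant treats punctuation as a boundary too, so 'fox fox,' would count as a repeat there but not with the whitespace-only pattern above.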