Re: Detecting repeated phrase in a string

From: Shaozhong SHI <shishaozhong(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Detecting repeated phrase in a string
Date: 2021-12-09 14:46:05
Message-ID: CA+i5JwaEwK=ktV-H-xS2dHgGfWL0RPRDVhcghJ5rQM45DqLY-g@mail.gmail.com
Lists: pgsql-general

Hi, Peter,

How can I define a word boundary as any of ^, a space, or $,
so that the following two cases are told apart?

"fox fox" is a repeat.

"foxfox" is not a repeat, just one word.
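
To make the requirement concrete, here is a minimal sketch of the kind of pattern being asked for. It is written with Python's re module rather than PostgreSQL, purely so that it can be run and checked; the function name find_repeated_word is hypothetical. The captured word must be preceded by ^ or a space, separated from its copy by exactly one space, and followed by a space or $. PostgreSQL's regular expressions also support backreferences and lookahead, so a similar pattern should carry over to regexp_match, but that is an assumption not verified here.

```python
import re
from typing import Optional

# A word is a run of non-space characters, delimited by
# start-of-string, a single space, or end-of-string.
# (?:^| )    left boundary: start of string or a space
# ([^ ]+)    the candidate word (captured as group 1)
#  \1        a single space, then the same word again
# (?= |$)    right boundary: a space or end of string (not consumed)
REPEATED_WORD = re.compile(r"(?:^| )([^ ]+) \1(?= |$)")


def find_repeated_word(text: str) -> Optional[str]:
    """Return the first word that is immediately repeated, or None."""
    match = REPEATED_WORD.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for sample in ["fox fox", "foxfox", "the fox fox jumps", "fox foxes"]:
        print(repr(sample), "->", find_repeated_word(sample))
```

With this pattern "fox fox" reports "fox", while "foxfox" and "fox foxes" report nothing, because the second copy has to end at a space or at the end of the string.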

Regards,

David

On Thu, 9 Dec 2021 at 13:35, Peter J. Holzer <hjp-pgsql(at)hjp(dot)at> wrote:

> On 2021-12-09 12:38:15 +0000, Shaozhong SHI wrote:
> > Does anyone know how to detect repeated phrase in a string?
>
> Use regular expressions with backreferences:
>
> bayes=> select regexp_match('foo wikiwiki bar', '(.+)\1');
> ╔══════════════╗
> ║ regexp_match ║
> ╟──────────────╢
> ║ {o}          ║
> ╚══════════════╝
> (1 row)
>
> "o" is repeated in "foo".
>
> bayes=> select regexp_match('fo wikiwiki bar', '(.+)\1');
> ╔══════════════╗
> ║ regexp_match ║
> ╟──────────────╢
> ║ {wiki}       ║
> ╚══════════════╝
> (1 row)
>
> "wiki" is repeated in "wikiwiki".
>
> bayes=> select regexp_match('fo wikiwi bar', '(.+)\1');
> ╔══════════════╗
> ║ regexp_match ║
> ╟──────────────╢
> ║ (∅)          ║
> ╚══════════════╝
> (1 row)
>
> nothing is repeated.
>
> Adjust the expression within parentheses if you want to match something
> more specific than any sequence of one or more characters.
>
> hp
>
> --
> _ | Peter J. Holzer | Story must make more sense than reality.
> |_|_) | |
> | | | hjp(at)hjp(dot)at | -- Charles Stross, "Creative writing
> __/ | http://www.hjp.at/ | challenge!"
>
