Re: Detecting repeated phrase in a string

From: Andreas Joseph Krogh <andreas(at)visena(dot)com>
To: Shaozhong SHI <shishaozhong(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Detecting repeated phrase in a string
Date: 2021-12-09 15:11:31
Message-ID: VisenaEmail.50.2affbb94dee70e79.17d9fba1d20@tc7-visena
Lists: pgsql-general


On Thursday, 9 December 2021 at 15:46:05, Shaozhong SHI
<shishaozhong(at)gmail(dot)com> wrote:

> Hi, Peter,
>
> How can a word boundary be defined as any of
> ^, a space, or $,
>
> so that the following distinction can be made?
>
> "fox fox" is a repeat.
>
> "foxfox" is not a repeat but just one word.

Do you want to detect a repeated phrase (a sequence of words) or repeated words?
For repeated words (including Unicode letters) you can use:

(\b\p{L}+\b)(?:\s+\1)+

I'm not quite sure how to translate this to PostgreSQL, but it works in Java.
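
For reference, here is a minimal sketch of how the pattern above might be used from Java. The class and method names (RepeatedWords, hasRepeatedWord, repeatedWords) are made up for illustration, and the UNICODE_CHARACTER_CLASS flag is an added assumption so that \b is treated the same way for non-ASCII letters across JDK versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class RepeatedWords {
    // The pattern from the message; backslashes are doubled for a Java string literal.
    // UNICODE_CHARACTER_CLASS is an assumption added here, not part of the original pattern.
    static final Pattern REPEATED_WORD =
        Pattern.compile("(\\b\\p{L}+\\b)(?:\\s+\\1)+", Pattern.UNICODE_CHARACTER_CLASS);

    // True if the text contains a word immediately repeated after whitespace.
    static boolean hasRepeatedWord(String text) {
        return REPEATED_WORD.matcher(text).find();
    }

    // Collects each word that starts a run of repeats (capture group 1).
    static List<String> repeatedWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = REPEATED_WORD.matcher(text);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedWord("fox fox"));   // repeated word
        System.out.println(hasRepeatedWord("foxfox"));    // single word, no repeat
        System.out.println(repeatedWords("the the quick fox fox fox"));
    }
}
```

One caveat: as written, the pattern has no boundary after the final \1, so "fox foxes" is also reported as a repeat. Appending \b after \1 is one possible way to tighten it. The match is also case-sensitive, so "Fox fox" is not detected unless Pattern.CASE_INSENSITIVE is added.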

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas(at)visena(dot)com
https://www.visena.com
