From: | Andreas Joseph Krogh <andreas(at)visena(dot)com> |
---|---|
To: | Shaozhong SHI <shishaozhong(at)gmail(dot)com> |
Cc: | pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Detecting repeated phrase in a string |
Date: | 2021-12-09 15:11:31 |
Message-ID: | VisenaEmail.50.2affbb94dee70e79.17d9fba1d20@tc7-visena |
Lists: | pgsql-general |
On Thursday, 9 December 2021 at 15:46:05, Shaozhong SHI <shishaozhong(at)gmail(dot)com> wrote:
Hi, Peter,
How to define word boundary as either by using
^ , space, or $
So that the following can be done
fox fox is a repeat
foxfox is not a repeat but just one word.
Do you want a repeated phrase (a list of words) or repeated words?
For repeated words (including Unicode letters) you can use:
(\b\p{L}+\b)(?:\s+\1)+
I'm not quite sure how to translate this to PG, but it works in Java.
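A minimal sketch of how the pattern above might be used in Java, assuming plain `java.util.regex`. The class name `RepeatedWords` and the helper `findRepeats` are made up for illustration and are not from the original message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedWords {
    // The pattern from the message: a run of letters delimited by word
    // boundaries, followed by one or more whitespace-separated copies of itself.
    private static final Pattern REPEAT =
            Pattern.compile("(\\b\\p{L}+\\b)(?:\\s+\\1)+");

    // Hypothetical helper: returns each word that is immediately repeated.
    static List<String> findRepeats(String text) {
        List<String> repeats = new ArrayList<>();
        Matcher m = REPEAT.matcher(text);
        while (m.find()) {
            repeats.add(m.group(1)); // group 1 is the repeated word itself
        }
        return repeats;
    }

    public static void main(String[] args) {
        System.out.println(findRepeats("fox fox")); // prints [fox]
        System.out.println(findRepeats("foxfox"));  // prints []
    }
}
```

One caveat with the pattern as written: nothing anchors the end of the back-reference, so "fox foxes" would also match on "fox fox". Appending `\b` after `\1` should avoid that if only whole-word repeats are wanted.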
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas(at)visena(dot)com <mailto:andreas(at)visena(dot)com>
www.visena.com <https://www.visena.com>