Re: Counting the number of repeated phrases in a column

From: Rob Sargent <robjsargent(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Counting the number of repeated phrases in a column
Date: 2022-01-26 21:40:07
Message-ID: 7f204b3c-3224-f8cb-a841-879f57ebf120@gmail.com
Lists: pgsql-general

On 1/26/22 13:35, Shaozhong SHI wrote:
>
>
> On Tue, 25 Jan 2022 at 17:10, Shaozhong SHI <shishaozhong(at)gmail(dot)com>
> wrote:
>
> Standard Postgres lacks a function to do the following:
>
> It is easy to count the number of occurrences of words, but it is
> rather difficult to count the number of occurrences of phrases.
>
> For instance:
>
> A cell of value: 'Hello World' means 1 occurrence of a phrase.
>
> A cell of value: 'Hello World World Hello' means no occurrence of
> any repeated phrase.
>
> But, A cell of value: 'Hello World World Hello Hello World' means
> 2 occurrences of 'Hello World'.
>
> 'The City of London, London' also has no occurrences of any
> repeated phrase.
>
> Does anyone have such a function to count the number of
> occurrences of any repeated phrases?
>
> Regards,
>
> David
>
>
> Hi, All Friends,
>
> Whatever. Can we try to build a regex for 'The City of London
> London Great London UK '?
>
> It could be something like '[\w\s]+[\s-]+[a-z]+[\s-][\s\w]+'.
> The [\s-]+[a-z]+[\s-] part caters for people who write 'City of
> London' as 'City-of-London' or 'City-of-London'.
>
> Regards,
>
> David

Do you really want "The City of", by itself, to be one of the detected
phrases? For example, in 'The City of London London Great London UK The
City of Liverpool'.
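
To make that ambiguity concrete, here is a rough sketch in Python (not a
Postgres function, and only one possible reading of "repeated phrase"). It
assumes a phrase is any run of two or more consecutive words that appears
more than once, compared case-sensitively with punctuation ignored. The
names repeated_phrases and min_words are made up for illustration.

```python
import re
from collections import Counter


def repeated_phrases(text, min_words=2):
    """Return {phrase: count} for word runs that occur more than once.

    Assumption: a "phrase" is any run of min_words or more consecutive
    words. Matching is case-sensitive and punctuation is ignored.
    """
    words = re.findall(r"\w+", text)
    repeated = {}
    # Only runs up to half the word count are checked, so a run can
    # repeat without overlapping itself.
    for n in range(min_words, len(words) // 2 + 1):
        grams = Counter(
            " ".join(words[i:i + n]) for i in range(len(words) - n + 1)
        )
        repeated.update({p: c for p, c in grams.items() if c > 1})
    return repeated


print(repeated_phrases("Hello World World Hello Hello World"))
print(repeated_phrases(
    "The City of London London Great London UK The City of Liverpool"
))
```

Under these assumptions the first call reports 'Hello World' twice, as in
the original examples, while the second reports 'The City', 'City of' and
'The City of' twice each. Whether those partial phrases should count is
exactly the question that has to be settled before writing a regex.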
