From: | Shaozhong SHI <shishaozhong(at)gmail(dot)com> |
---|---|
To: | Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net> |
Cc: | pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Counting the number of repeated phrases in a column |
Date: | 2022-01-25 17:30:31 |
Message-ID: | CA+i5JwaMeZbhJqeMeh1YafMMHQ2V5-ci2mJ-Bes4yn_MCcWbQQ@mail.gmail.com |
Lists: | pgsql-general |
How about splitting the value into individual words, keeping their order?
Then add words together to form individual phrases, ensuring that each phrase
consists only of unique/distinct words.
Count the repeated phrases afterwards.
How about this?
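A minimal sketch of that idea, written in Python rather than SQL just to make the steps concrete. The function names `split_into_phrases` and `repeated_phrases` are made up for illustration, and the sketch assumes one particular way of resolving ambiguity: a phrase is closed as soon as the next word would repeat a word already in it, punctuation is ignored, and matching is case-sensitive.

```python
import re
from collections import Counter


def split_into_phrases(text):
    """Greedily cut text into phrases of distinct words, keeping word order.

    Assumption: a phrase ends right before a word that it already contains.
    """
    phrases = []
    current = []
    # \w+ drops punctuation, so 'London,' and 'London' count as the same word.
    for word in re.findall(r"\w+", text):
        if word in current:
            # The word would repeat inside the phrase: close it, start a new one.
            phrases.append(" ".join(current))
            current = []
        current.append(word)
    if current:
        phrases.append(" ".join(current))
    return phrases


def repeated_phrases(text):
    """Return only the phrases that occur more than once, with their counts."""
    counts = Counter(split_into_phrases(text))
    return {phrase: n for phrase, n in counts.items() if n > 1}


print(repeated_phrases("Hello World World Hello Hello World"))
print(repeated_phrases("The City of London, London"))
```

With this cutting rule, 'Hello World World Hello Hello World' becomes 'Hello World', 'World Hello', 'Hello World', so 'Hello World' is counted twice, while 'The City of London, London' yields no repeated phrase. A different cutting rule would give different answers for other inputs.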
Regards,
David
On Tue, 25 Jan 2022 at 17:22, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
wrote:
> > There is a short of a function in the standard Postgres to do the following:
> >
> > it is easy to count the number of occurrence of words, but it is rather difficult to count the number of occurrence of phrases.
> >
> > For instance:
> >
> > A cell of value: 'Hello World' means 1 occurrence a phrase.
> >
> > A cell of value: 'Hello World World Hello' means no occurrence of any repeated phrase.
> >
> > But, A cell of value: 'Hello World World Hello Hello World' means 2 occurrences of 'Hello World'.
> >
> > 'The City of London, London' also has no occurrences of any repeated phrase.
> >
> > Anyone has got such a function to check out the number of occurrence of any repeated phrases?
>
> For that to become answerable you may want to define what to
> do when facing ambiguity.
>
> Best,
> Karsten