Re: Grouping By Similarity (using pg_trgm)?

From:	Oleg Bartunov <obartunov(at)gmail(dot)com>
To:	Cory Tucker <cory(dot)tucker(at)gmail(dot)com>
Cc:	"pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Grouping By Similarity (using pg_trgm)?
Date:	2015-05-22 09:37:25
Message-ID:	CAF4Au4yzCzJ7VnfGhMnY_9BjqzCuGKsxBxaLXrfJtkLeurcU-Q@mail.gmail.com
Lists:	pgsql-general

Have you seen http://www.sai.msu.su/~megera/postgres/talks/pgcon-2012.pdf ?

On Thu, May 14, 2015 at 9:58 PM, Cory Tucker <cory(dot)tucker(at)gmail(dot)com> wrote:

> [pg version 9.3 or 9.4]
>
> Suppose I have a simple table:
>
> create table data (
>     my_value TEXT NOT NULL
> );
> CREATE INDEX idx_my_value ON data USING gin (my_value gin_trgm_ops);
>
>
> Now I would like to essentially do a GROUP BY to get a count of all the
> values that are sufficiently similar. I can do it using something like a
> CROSS JOIN to join the table on itself, but then I am still getting all the
> rows with duplicate counts.
>
> Is there a way to do a group by query and only return a single "my_value"
> column and a count of the number of times other values are similar while
> also not returning the included similar values in the output, too?
>
>
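One possible way to approach the query described above, offered as a rough sketch rather than a definitive answer: self-join the table with the pg_trgm `%` operator, pick one representative per value (here, arbitrarily, the smallest similar value by sort order), then group on that representative. The column aliases `representative` and `similar_count` and the CTE name `rep` are made up for illustration. The sketch assumes pg_trgm is installed and that the default similarity threshold is acceptable. Trigram similarity is not transitive, so this is a heuristic: a value chosen as a representative may itself be counted under a different, smaller representative.

```sql
-- Assumption: the extension is available in this database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- On 9.3/9.4 the threshold used by % is read with show_limit()
-- and changed with set_limit(); the default is 0.3.
SELECT show_limit();

WITH rep AS (
    -- For every distinct value, choose the smallest value it is similar to.
    -- A value with at least one trigram is similar to itself, so it always
    -- gets a representative.
    SELECT a.my_value,
           min(b.my_value) AS representative
    FROM data a
    JOIN data b ON a.my_value % b.my_value
    GROUP BY a.my_value
)
-- One output row per representative, with the number of distinct values
-- that were mapped to it.
SELECT representative AS my_value,
       count(*)       AS similar_count
FROM rep
GROUP BY representative
ORDER BY similar_count DESC;
```

Because the inner GROUP BY collapses identical strings, `similar_count` counts distinct values, not rows. The self-join can still be expensive on a large table even with the trigram index in place.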

In response to

Grouping By Similarity (using pg_trgm)? at 2015-05-14 18:58:57 from Cory Tucker
