Grouping By Similarity (using pg_trgm)?

From:	Cory Tucker <cory(dot)tucker(at)gmail(dot)com>
To:	"pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject:	Grouping By Similarity (using pg_trgm)?
Date:	2015-05-14 18:58:57
Message-ID:	CAG_=8kBQk-3eesdZ-7iAH-x0oCX=Ob7Qekbp-y9rEM71smBZrQ@mail.gmail.com
Lists:	pgsql-general

[PostgreSQL version 9.3 or 9.4]

Suppose I have a simple table:

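-- gin_trgm_ops is provided by the pg_trgm extension, which must be installed first
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE data (
    my_value TEXT NOT NULL
);
CREATE INDEX idx_my_value ON data USING gin (my_value gin_trgm_ops);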

Now I would like to essentially do a GROUP BY to get a count of all the
values that are sufficiently similar to each other. I can do it using
something like a CROSS JOIN of the table against itself, but then every
member of a group of similar values still comes back as its own row, each
with its own (duplicate) count.
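
For illustration, the self-join described above might look roughly like the following. This is a minimal sketch, not the exact query from the post: it assumes pg_trgm's `%` operator with its default similarity threshold of 0.3 (adjustable with `set_limit()` on 9.3/9.4), and the alias `similar_count` is made up for the example.

```sql
-- Sketch of the self-join approach (assumes the pg_trgm extension is installed).
-- a.my_value % b.my_value is true when similarity(a, b) exceeds the current
-- threshold, which defaults to 0.3.
SELECT a.my_value,
       count(*) AS similar_count   -- includes the row's match against itself
FROM data AS a
JOIN data AS b
  ON a.my_value % b.my_value
GROUP BY a.my_value
ORDER BY similar_count DESC;
```

Every value in a cluster of similar strings appears as its own output row here, which is the duplication the question is about.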

Is there a way to write a GROUP BY query that returns only a single
"my_value" per group of similar values, together with a count of how many
other values are similar to it, without also returning those similar values
as separate rows in the output?
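
One possible way to approach this, offered only as a hedged sketch, is to assign each value a representative (here, the lexicographically smallest value it is similar to) and then group by that representative. The names `rep`, `grouped`, and `group_size` are hypothetical, and the default `%` threshold of 0.3 is assumed. Because trigram similarity is not transitive, the grouping is approximate: a value chosen as a representative for one row can itself be assigned to a different representative.

```sql
-- Sketch only: pick the smallest similar value as each value's representative,
-- then count how many distinct values share that representative.
SELECT rep,
       count(*) AS group_size      -- number of distinct my_value entries mapped to rep
FROM (
    SELECT a.my_value,
           min(b.my_value) AS rep  -- every value is similar to itself, so rep is never NULL
    FROM data AS a
    JOIN data AS b
      ON a.my_value % b.my_value
    GROUP BY a.my_value
) AS grouped
GROUP BY rep
ORDER BY group_size DESC;
```

The inner query collapses repeated rows with the same `my_value`, so `group_size` counts distinct values rather than table rows. A fully consistent clustering would need something more involved, such as walking chains of similar values with a recursive query.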

Responses

Re: Grouping By Similarity (using pg_trgm)? at 2015-05-14 19:08:22 from David G. Johnston
Re: Grouping By Similarity (using pg_trgm)? at 2015-05-22 09:37:25 from Oleg Bartunov
