| From: | Cory Tucker <cory(dot)tucker(at)gmail(dot)com> |
|---|---|
| To: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
| Subject: | Grouping By Similarity (using pg_trgm)? |
| Date: | 2015-05-14 18:58:57 |
| Message-ID: | CAG_=8kBQk-3eesdZ-7iAH-x0oCX=Ob7Qekbp-y9rEM71smBZrQ@mail.gmail.com |
| Lists: | pgsql-general |
[pg version 9.3 or 9.4]
Suppose I have a simple table:
CREATE TABLE data (
    my_value TEXT NOT NULL
);
CREATE INDEX idx_my_value ON data USING gin (my_value gin_trgm_ops);
Now I would like to essentially do a GROUP BY to get a count of all the
values that are sufficiently similar to each other. I can do it using
something like a CROSS JOIN to join the table to itself, but then I still
get a row for every value, so each set of similar values shows up several
times with duplicate counts.
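For concreteness, the self-join attempt looks roughly like the sketch below. The aliases a and b and the output name similar_count are only illustrative, and it assumes the pg_trgm extension is installed and that its % operator is used with the default similarity threshold.

```sql
-- Sketch only: assumes pg_trgm is available in this database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Pair every value with every value it is "similar" to. The % operator is
-- true when trigram similarity exceeds the current threshold (0.3 by default).
SELECT a.my_value,
       count(*) AS similar_count   -- counts matching rows, including a row's match with itself
FROM data AS a
CROSS JOIN data AS b
WHERE a.my_value % b.my_value
GROUP BY a.my_value
ORDER BY similar_count DESC;
```

This produces one output row for every distinct my_value, so every member of a set of mutually similar values appears with its own count, which is the duplication described above.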
Is there a way to write a GROUP BY query that returns a single "my_value"
for each set of similar values, along with a count of how many other values
are similar to it, without also returning those similar values as rows of
their own in the output?