Re: Find similar records (compare tsvectors)

From:	Patrick Dung <patrick_dkt(at)yahoo(dot)com(dot)hk>
To:	Patrick Dung <patrick_dkt(at)yahoo(dot)com(dot)hk>, Pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Find similar records (compare tsvectors)
Date:	2015-03-06 14:05:59
Message-ID:	484316264.2758984.1425650759221.JavaMail.yahoo@mail.yahoo.com
Lists:	pgsql-general

Resend.
How can I quickly compare the similarity of two tsvectors?

On Monday, March 2, 2015 11:01 PM, Patrick Dung <patrick_dkt(at)yahoo(dot)com(dot)hk> wrote:

Hello,
I have a database with articles and attachments stored in bytea format. I also have a trigger that sets the tsv column whenever a record is inserted or updated. The tsv column has a GIN index. With this setup, I can do very fast keyword searches on tsv.
Suppose I have a specific record (id=100000). How can I list similar records, ordered by rank? For that, I need to compare one tsvector with another tsvector.
I have the following SQL, which casts the original tsv to text and then to a tsquery, so that I can compare a tsvector with a tsquery:
SELECT ts_rank(i.tsv, replace(strip(original.tsv)::text, ' ', '|')::tsquery) AS similarity,
       i.company, i.industry, i.post_timestamp, i.id
FROM items i,
     (SELECT tsv, id FROM items WHERE id=100000) AS original
WHERE i.id != original.id
ORDER BY similarity;
items table:
id bigint
company varchar
industry varchar
description varchar
post_timestamp timestamp
attachment bytea
tsv tsvector
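
As a minimal illustration of the conversion step used in the query above, the following sketch runs it on a literal tsvector. The lexemes are made up for the example and are not data from the items table, and the sketch assumes that no lexeme contains a space, since the replace() call would alter such a lexeme.

```sql
-- Hypothetical lexemes for illustration only; not taken from the items table.
SELECT replace(strip('cat:3 fat:2 rat:5'::tsvector)::text, ' ', '|')::tsquery AS or_query;
-- strip() drops the positions, the cast to text yields 'cat' 'fat' 'rat',
-- and replacing each space with | gives the tsquery 'cat' | 'fat' | 'rat'
```

The result is an OR query over every lexeme of the source record, which is what ts_rank then scores each other row's tsv against.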

The problem is that this query is very slow. Any comments?
Thanks and regards,
Patrick

In response to

Find similar records (compare tsvectors) at 2015-03-02 14:57:56 from Patrick Dung

Responses

Re: Find similar records (compare tsvectors) at 2015-03-06 16:06:26 from Oleg Bartunov
