Re: tsearch2 and pdf files

From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "Henrik Zagerholm" <henke(at)mac(dot)se>, "Philip Johnson" <philip(dot)johnson(at)atempo(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: tsearch2 and pdf files
Date: 2006-12-11 22:08:51
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCEA0FDF8@algol.sollentuna.se
Lists: pgsql-general

> 1. Convert PDF to file with e.g. xpdf
> 2. Insert parsed text to a table of your choice.
> 3. Make vectors from the text.

Actually, if you're not going to use the headline() function, you can
store just the vector and cut down on the size requirements: insert the
to_tsvector() result directly instead of the parsed text. headline()
needs the full text, though, so you can't cheat on that.
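
A minimal sketch of the vector-only approach, assuming the tsearch2
contrib module is installed and that its 'default' configuration is the
one in use. The table, column, and index names are hypothetical, and the
sample text stands in for whatever the PDF converter produced.

```sql
-- Hypothetical table: it keeps only the search vector, not the extracted text.
CREATE TABLE pdf_docs (
    id       serial PRIMARY KEY,
    filename text NOT NULL,
    vec      tsvector
);

-- Store the to_tsvector() result directly; the text itself is discarded.
-- 'default' is assumed to be the tsearch2 configuration name.
INSERT INTO pdf_docs (filename, vec)
VALUES ('manual.pdf',
        to_tsvector('default', 'text extracted from the PDF goes here'));

-- Index the vector column so that searches do not scan every row.
CREATE INDEX pdf_docs_vec_idx ON pdf_docs USING gist (vec);

-- Searching needs only the vector.
SELECT id, filename
FROM pdf_docs
WHERE vec @@ to_tsquery('default', 'extracted & pdf');
```

With this layout there is no text column left to pass to headline(), so
result snippets are not available; keeping them would mean storing the
parsed text alongside the vector.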

//Magnus
