From: | Henrik Zagerholm <henke(at)mac(dot)se> |
---|---|
To: | Philip Johnson <philip(dot)johnson(at)atempo(dot)com> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: tsearch2 and pdf files |
Date: | 2006-12-11 22:06:07 |
Message-ID: | 179575E2-2F49-427F-9961-CEE966187950@mac.se |
Lists: | pgsql-general |
1. Convert the PDF to a text file with e.g. xpdf.
2. Insert the parsed text into a table of your choice.
3. Build tsearch2 vectors (tsvector) from the text.
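
The three steps above could be sketched roughly as follows. This is only a minimal illustration, not a tested recipe: the table `pdf_docs`, its columns, the index name and the helper functions are all made up for the example, and it assumes xpdf's `pdftotext` is on the PATH and that the tsearch2 configuration is called 'default' (adjust to whatever your installation uses).

```python
import subprocess

# Hypothetical schema: table, column and index names are illustrative only.
# Run this once, e.g. from psql, before indexing anything.
CREATE_SQL = """
CREATE TABLE pdf_docs (
    id       serial PRIMARY KEY,
    filename text NOT NULL,
    body     text NOT NULL,
    idxfti   tsvector
);
CREATE INDEX pdf_docs_fti ON pdf_docs USING gist (idxfti);
"""

# The vector is built by the server from the same text that is stored.
INSERT_SQL = (
    "INSERT INTO pdf_docs (filename, body, idxfti) "
    "VALUES (%s, %s, to_tsvector(%s, %s))"
)


def extract_text(pdf_path):
    """Step 1: run pdftotext; '-' as the output file sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def index_pdf(conn, pdf_path, config="default", extract=extract_text):
    """Steps 2 and 3: store the text and build its tsvector in one INSERT.

    `conn` is any DB-API connection that uses %s placeholders.
    `config` is the text search configuration name (an assumption here).
    Returns the number of characters that were indexed.
    """
    text = extract(pdf_path)
    cur = conn.cursor()
    cur.execute(INSERT_SQL, (pdf_path, text, config, text))
    conn.commit()
    return len(text)
```

With a driver such as psycopg2 (again an assumption, any DB-API driver should do) you would open a connection, call `index_pdf(conn, "manual.pdf")` for each file, and then search the `idxfti` column with the `@@` operator and `to_tsquery`.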
Cheers,
On 11 Dec 2006, at 18:23, Philip Johnson wrote:
> Do you know what kind of table I should use?
> Is there a shell script or a PHP script that does the work?
>
> regards
>
>> -----Original Message-----
>> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Hannes Dorbath
>> Sent: Monday, 11 December 2006 12:21
>> To: pgsql-general(at)postgresql(dot)org
>> Subject: Re: [GENERAL] tsearch2 and pdf files
>>
>> You just need software that extracts the text from it. Search Google for
>> pdf2txt and others. Printer drivers that try to get text from anything
>> are available as well.
>>
>>
>> On 11.12.2006 11:41, Philip Johnson wrote:
>>> I'm using Postgresql 8.1.5
>>>
>>> Tsearch2 is installed and runs well
>>>
>>> I'd like to use tsearch2 to index PDF files.
>>>
>>> Does someone have a detailed process to implement that?
>>
>>
>> --
>> Regards,
>> Hannes Dorbath
>>