Re: Importing text file into a TEXT field

From: Adriana Alfonzo <adriana(dot)alfonzo(at)venalum(dot)com(dot)ve>
To: Bruno Lavoie <bruno(dot)lavoie(at)gmail(dot)com>
Cc: Thomas Kellerer <spam_eater(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: Importing text file into a TEXT field
Date: 2008-11-10 14:00:31
Message-ID: 49183E7F.7080200@venalum.com.ve
Lists: pgsql-general

I do not want to receive any more emails on these topics in which I am
not involved, thank you.

Bruno Lavoie wrote:
> Hello Thomas,
>
> nice tool!
>
> Thanks for your help... For batch and automation purposes, I've written a
> small script... (see my last post in the thread, from a few minutes ago)
>
> Bruno Lavoie
>
>
> Thomas Kellerer wrote:
>> Bruno Lavoie, 07.11.2008 19:20:
>>> Hello,
>>>
>>> The intent is to use pdftotext and store the resulting text in
>>> the database for full text search purposes... I'm trying to develop a
>>> mini content server where I'll put PDF documents to make them searchable.
>>>
>>> Generally, the PDFs run from 500 to 3000 pages, resulting in text
>>> of 500 kB to 2 MB...
>>>
>>> I'm also looking at open source projects like Alfresco if it can
>>> serve with ease to my purpose... Anyone use this one? Comments are
>>> welcome.
>>
>> If you are not bound to "native" Postgres tools, you might want to
>> take a look at my SQL Workbench/J (http://www.sql-workbench.net)
>>
>> It can insert the contents of files (located on the client) into
>> tables. You can either do this using an extended SQL syntax:
>> UPDATE pdf_table SET text_content = {$clobfile=c:/temp/convertet.txt
>> encoding=utf8}
>> WHERE id = 42;
>>
>> (of course this statement cannot be run with psql)
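
For readers limited to standard client tools, the same single-row update can be sketched in Python. This is not from the thread: it assumes a DB-API 2.0 driver (for example psycopg2, whose parameter marker is `%s`) and reuses the `pdf_table`, `text_content` and `id` names from the statement above. The function name `store_text` and the `placeholder` argument are hypothetical.

```python
def store_text(conn, doc_id, text_path, placeholder="%s"):
    """Read a UTF-8 text file and store it in pdf_table.text_content.

    conn is any DB-API 2.0 connection (assumption: e.g. psycopg2).
    placeholder is the driver's parameter marker ("%s" for psycopg2).
    Returns the number of rows updated.
    """
    with open(text_path, encoding="utf-8") as fh:
        content = fh.read()

    # Only the parameter marker is formatted into the SQL; the file
    # content and the id are passed as bound parameters.
    sql = "UPDATE pdf_table SET text_content = {p} WHERE id = {p}".format(
        p=placeholder
    )
    cur = conn.cursor()
    cur.execute(sql, (content, doc_id))
    conn.commit()
    return cur.rowcount
```

Passing the content as a bound parameter avoids any manual quoting of a text that may be several megabytes long.
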
>>
>> You could also bulk-upload several files at once using my flat-file
>> import.
>> (http://www.sql-workbench.net/manual/command-import.html)
>>
>> Assuming the table has two columns (id, text_content), the flat file
>> would look like this:
>>
>> id|text_content
>> 1|content_1.txt
>> 2|content_2.txt
>> 3|content_3.txt
>>
>> and the import would store the content of the files, not the literal
>> 'content_1.txt', in the column text_content.
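
An index file in the layout shown above could be generated with a few lines of Python. This is only a sketch: the function name `build_index`, the directory layout and the sequential id assignment are assumptions, and the import options that tell SQL Workbench/J to treat the second column as a file reference are not covered here (see the manual page linked above).

```python
from pathlib import Path


def build_index(text_dir, index_path, start_id=1):
    """Write an 'id|text_content' index listing every .txt file in text_dir.

    Ids are assigned sequentially in file-name order (an assumption; use
    real document ids if the table already has them).
    Returns the number of files listed.
    """
    files = sorted(Path(text_dir).glob("*.txt"))
    lines = ["id|text_content"]
    for row_id, path in enumerate(files, start=start_id):
        # Only the file name is written, matching the example layout.
        lines.append(f"{row_id}|{path.name}")
    Path(index_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(files)
```

Give the index file a name that does not end in `.txt`, or keep it outside `text_dir`, so that a second run does not list the index itself.
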
>>
>> You can either insert or update the content, depending on your needs.
>> You could even store the original PDF file if the table contains a
>> bytea column for the blob data.
>>
>> Contact me offline (contact information on my homepage) if you need
>> help.
>>
>> Regards
>> Thomas
>>
>>
>
>

Legal Notice: This message may contain information of interest only to CVG Venalum. Copying, distributing, or using it is permitted only for authorized persons. If you received this email in error, please destroy it. Emails can occasionally be altered. In this regard, CVG Venalum accepts no responsibility for errors that may affect the original message.

Attachment Content-Type Size
adriana_alfonzo.vcf text/x-vcard 293 bytes
