From: | Bruno Lavoie <bruno(dot)lavoie(at)gmail(dot)com> |
---|---|
To: | Thomas Kellerer <spam_eater(at)gmx(dot)net> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Importing text file into a TEXT field |
Date: | 2008-11-10 14:39:00 |
Message-ID: | 49184784.70002@gmail.com |
Lists: | pgsql-general |
Hello Thomas,
nice tool!
Thanks for your help. For batch and automation purposes, I've written a
small script (see my last post in this thread, a few minutes ago).
Bruno Lavoie
Thomas Kellerer wrote:
> Bruno Lavoie, 07.11.2008 19:20:
>> Hello,
>>
>> The intent is to use pdftotext and store the resulting text in the
>> database for full-text search purposes... I'm trying to develop a mini
>> content server where I'll put PDF documents to make them searchable.
>>
>> Typically, the PDFs are 500 to 3000 pages long, resulting in text
>> files of 500 KB to 2 MB...
>>
>> I'm also looking at open source projects like Alfresco, to see whether
>> one could easily serve my purpose... Does anyone use it? Comments are
>> welcome.
>
> If you are not bound to "native" Postgres tools, you might want to
> take a look at my SQL Workbench/J (http://www.sql-workbench.net)
>
> It can insert the contents of files (located on the client) into
> tables. You can either do this using an extended SQL syntax:
> UPDATE pdf_table SET text_content = {$clobfile=c:/temp/convertet.txt
> encoding=utf8}
> WHERE id = 42;
>
> (of course, this statement cannot be run with psql)
>
> You could also bulk-upload several files at once using my flat-file
> import.
> (http://www.sql-workbench.net/manual/command-import.html)
>
> Assuming the table has two columns (id, text_content), the flat file
> would look like this:
>
> id|text_content
> 1|content_1.txt
> 2|content_2.txt
> 3|content_3.txt
>
> and the import would store the content of the files, not the literal
> 'content_1.txt', in the column text_content.
>
> You can either insert or update the content, depending on your needs.
> You could even store the original PDF file if the table contains a
> bytea column for the blob data.
>
> Contact me offline (contact information on my homepage) if you need help.
>
> Regards
> Thomas
>
>
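
For batch use, the flat index file described in the quoted message (a header line followed by one `id|file name` row per document) could be generated rather than typed by hand. The following is only a minimal sketch: the function names, the sequential ids starting at 1, and the `*.txt` pattern are assumptions made for illustration, and it does not show how the import tool is told that the second column holds file names (see the import manual linked above for that).

```python
from pathlib import Path


def build_index(file_names, start_id=1, delimiter="|"):
    """Return the text of a flat file that maps ids to content file names.

    The header (id, text_content) and the '|' delimiter follow the example
    in the quoted message; sequential ids are an assumption.
    """
    lines = [delimiter.join(("id", "text_content"))]
    for offset, name in enumerate(file_names):
        # A delimiter inside a file name would corrupt the row, so refuse it.
        if delimiter in name:
            raise ValueError(f"file name contains the delimiter: {name!r}")
        lines.append(f"{start_id + offset}{delimiter}{name}")
    return "\n".join(lines) + "\n"


def index_for_directory(directory, pattern="*.txt"):
    """Build the index for every file in `directory` matching `pattern`.

    Files are sorted by name so that repeated runs assign the same ids.
    """
    names = sorted(p.name for p in Path(directory).glob(pattern))
    return build_index(names)
```

Calling `index_for_directory` on the folder that holds the pdftotext output returns the index text, which can then be written to a file next to the converted documents.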