Re: what is the best way of storing text+image documents in postgresql

From: John R Pierce <pierce(at)hogranch(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: what is the best way of storing text+image documents in postgresql
Date: 2011-06-08 19:37:51
Message-ID: 4DEFCF8F.1030900@hogranch.com
Lists: pgsql-general

On 06/08/11 6:06 AM, Craig Ringer wrote:
>> 1. save .doc documents in bytea columns. and show them with a word
>> reader in web page (disadvantage: it needs a proper .doc reader
>> installed on user computer)
>
> 1a: Convert the .doc files to a standard format like PDF that most
> browsers can display. That's what I'd do.

That's harder to integrate with a website, in the sense that PDF
documents have fixed page formatting and can at best be displayed in an
<iframe> within your site. Half the time they are only displayed in an
external PDF viewer, since browser PDF integration remains flaky and
bug-ridden after all these years. PDF text won't reflow to fit your page
layout, etc.

One approach to conversion might be to save the documents in an RTF-type
format and run that through a preprocessor that re-encodes them as
clean HTML or a similar markup language that you can deal with
intelligently. MS Word's own HTML converter creates wretched HTML with
tons of extra bizarro-world tags, which will likely trip up your page
formatting if you display these documents in context in your pages.
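
To illustrate the kind of cleanup meant here, below is a minimal sketch
of a whitelist filter for Word-exported HTML, using only Python's
standard html.parser module. It is not an RTF converter. The names
WordHtmlCleaner and clean_word_html, and the particular set of allowed
tags, are assumptions made for this example and would need adjusting
for real documents.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist: tags worth keeping when embedding a document in a page.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li",
                "h1", "h2", "h3", "table", "tr", "td", "th", "a", "img"}
# Only these attributes survive; class=, style=, lang= etc. are dropped.
KEEP_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}
VOID_TAGS = {"br", "img"}            # tags that have no closing tag
SKIP_CONTENT = {"style", "script"}   # drop these tags and everything inside


class WordHtmlCleaner(HTMLParser):
    """Re-emit only whitelisted tags and plain text (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags such as <span> or <o:p> are dropped
        allowed = KEEP_ATTRS.get(tag, ())
        kept = "".join(' {}="{}"'.format(name, escape(value or ""))
                       for name, value in attrs if name in allowed)
        self.out.append("<{}{}>".format(tag, kept))

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))
    # Comments (including Word's conditional comments) are ignored because
    # handle_comment is not overridden.


def clean_word_html(html_text):
    """Return html_text reduced to the whitelisted tags and attributes."""
    parser = WordHtmlCleaner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, clean_word_html("<p class=MsoNormal><span
style='font-size:12pt'>Hello <b>world</b></span><o:p></o:p></p>")
returns "<p>Hello <b>world</b></p>". A whitelist is used rather than a
blacklist because the set of odd tags Word can emit is open-ended.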

As Craig suggested, images can be stored as bytea values, and the image
links converted to point to a CGI script that fetches them from the
database and serves them to the client browser.
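
A rough sketch of such an image-serving script follows, written as a
small WSGI application rather than a classic CGI program. Everything
specific in it is an assumption for illustration: the table doc_images
with columns id, mime_type and data (bytea), the "id" query parameter,
and the get_connection callable, which is expected to return a DB-API
connection such as one from psycopg2 (whose %s placeholder style is
used in the query).

```python
from urllib.parse import parse_qs


def make_image_app(get_connection):
    """Build a WSGI app that serves images stored in a bytea column.

    get_connection is a hypothetical callable returning an open DB-API
    connection, e.g. lambda: psycopg2.connect("dbname=docs").
    """

    def app(environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))
        try:
            image_id = int(params["id"][0])
        except (KeyError, ValueError):
            start_response("400 Bad Request",
                           [("Content-Type", "text/plain")])
            return [b"missing or invalid id"]

        conn = get_connection()
        cur = conn.cursor()
        try:
            # Hypothetical schema: doc_images(id, mime_type, data bytea).
            # The id is passed as a parameter, never pasted into the SQL.
            cur.execute(
                "SELECT mime_type, data FROM doc_images WHERE id = %s",
                (image_id,),
            )
            row = cur.fetchone()
        finally:
            cur.close()

        if row is None:
            start_response("404 Not Found",
                           [("Content-Type", "text/plain")])
            return [b"no such image"]

        mime_type, data = row
        body = bytes(data)  # psycopg2 hands back bytea as a memoryview
        start_response("200 OK", [
            ("Content-Type", mime_type),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app
```

With this in place, an image link in the converted document would look
something like <img src="/image?id=7">, where the /image path is
whatever URL the application happens to be mounted at. The connection
is not closed in the handler, on the assumption that get_connection
manages pooling or reuse.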

--
john r pierce N 37, W 122
santa cruz ca mid-left coast
