Re: To BLOB Or Not To BLOB

From: "Robert B(dot) Easter" <reaster(at)comptechnews(dot)com>
To: Frank Joerdens <frank(at)joerdens(dot)de>, pgsql-general(at)postgresql(dot)org
Subject: Re: To BLOB Or Not To BLOB
Date: 2000-04-16 20:11:56
Message-ID: 00041617175901.00304@comptechnews
Lists: pgsql-general

On Sun, 16 Apr 2000, Frank Joerdens wrote:
> A while ago it was held that the Postgres large object data type was too
> new and not sufficiently tested and mature to be used in a production
> environment. I am about to deploy a little database that involves storing
> large-ish text files (20-500k), which could be done either as large objects
> or as pointers to files stored as ordinary files in the OS's filesystem. I
> am undecided on this question. What are the pros and cons? What is the
> performance in either case vis-a-vis the other? It is a web app; the
> interface is done in PHP. From which version on is the large object
> interface (if at all) to be considered stable and ready for production?
>

How are you going to dump and restore the large objects? The pg_dump and
pg_dumpall programs seem to handle only ordinary tables.

There are some programs out there to dump large objects and I've been playing
with one. It's worked well so far. You can get it at
ftp://ftp2.zf.jcu.cz/zakkr/pg/

I'd like to know what other ways there are to handle dumping/restoring large
objects. Isn't there any standard way that comes with PostgreSQL?
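
For what it's worth, one way you could do it by hand (just a rough sketch, and
not necessarily how that ftp tool does it -- I haven't studied its source) is
to keep the OIDs in an ordinary column and walk them from a little PHP script,
something like:

<?php
// Rough sketch only -- assumes a table "documents" whose "content_oid"
// column holds the large object OIDs.  Writes each object out to a flat
// file so it can be re-imported with pg_lo_import() after an initdb.
// (The exact pg_lo_* function names vary a bit between PHP versions.)
$conn = pg_connect("dbname=mydb");
$res  = pg_exec($conn, "SELECT id, content_oid FROM documents");

for ($i = 0; $i < pg_numrows($res); $i++) {
    $row = pg_fetch_array($res, $i);
    pg_exec($conn, "BEGIN");
    pg_lo_export($conn, $row["content_oid"], "/backup/lo/doc-" . $row["id"]);
    pg_exec($conn, "COMMIT");
}
?>

On restore you would do the reverse with pg_lo_import() and update the OID
column with the new OIDs it returns.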

Nevertheless, I've been using PHP and large objects with no problems so far.
I've been able to fully dump a database with all its large objects, upgrade
PostgreSQL to a new version, run initdb, and restore everything.
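
The request handler is nothing special -- roughly like this (a simplified
sketch; the table and column names are made up, and again the pg_lo_* names
differ slightly between PHP versions):

<?php
// Simplified sketch: serve an image straight out of a large object.
// Assumes a table "images" with columns "id" and "data_oid".
$conn = pg_connect("dbname=mydb");
$res  = pg_exec($conn, "SELECT data_oid FROM images WHERE id = "
                       . (int) $HTTP_GET_VARS["image"]);
$oid  = pg_result($res, 0, "data_oid");

pg_exec($conn, "BEGIN");
$lo = pg_lo_open($conn, $oid, "r");
header("Content-type: image/jpeg");
pg_lo_read_all($lo);     // streams the object straight to the browser
pg_lo_close($lo);
pg_exec($conn, "COMMIT");
?>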

Reading large objects seems fairly fast, too. On my website, all data is
stored in the database, including pictures. The version that is up and running
reads the data out of the database on every request! Since I use query-string
URLs like ?image=2345, web browsers don't cache much of it. It's not so slow
if you don't have a lot of traffic, but if you are planning for a lot of
traffic, try designing for on-the-fly content generation to speed things up.
That is what I am doing on the devel version of my site. That way the data is
only read out of the database on the first request; HTML and image files are
then generated on the filesystem to serve the same request the next time,
unless a query string is passed that overrides this and forces content
regeneration. It's a little tricky but cool.
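
The caching part itself is not much code either. Roughly (again just a sketch,
with made-up paths and names):

<?php
// Rough sketch of the cache-on-first-request idea.  If the generated file
// already exists, this script never needs to touch the database again
// (or, with mod_rewrite, never even runs -- see below).
$id    = (int) $HTTP_GET_VARS["image"];
$cache = "/var/www/cache/image-" . $id . ".jpg";

if (!file_exists($cache)) {
    $conn = pg_connect("dbname=mydb");
    $res  = pg_exec($conn, "SELECT data_oid FROM images WHERE id = " . $id);
    $oid  = pg_result($res, 0, "data_oid");
    pg_exec($conn, "BEGIN");
    pg_lo_export($conn, $oid, $cache);   // write the object out as a plain file
    pg_exec($conn, "COMMIT");
}

header("Content-type: image/jpeg");
readfile($cache);
?>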

I'd say go for storing everything in the database, so long as you master how
to dump and restore the large objects as well as the databases/tables with
pg_dump/pg_dumpall. This way it is possible to really separate your site code
from its data: all data is backed up in the database and its dumps, and all
code (PHP etc.) is backed up in a CVS repository. The HTML pages and images
generated on the filesystem, which look like an ordinary website, can be
treated as just temporary cached data that can be sent out quickly when it
matches a request. Look into mod_rewrite on Apache. If you want a simpler site
that still serves fast, then just use the file-pointer idea and keep the files
outside the database.
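
If you do go the caching route, the mod_rewrite part can be as simple as
something like this (off the top of my head, untested; the script name and
paths are made up):

RewriteEngine On
# Serve the file straight from the cache tree if it already exists,
# otherwise hand the request to the PHP generator.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^/images/(.*)$ /generate.php?image=$1 [L]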

Robert Easter
reaster(at)comptechnews(dot)com
