Re: Storing large files in multiple schemas: BLOB or BYTEA

From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: tigran2-postgres(at)riatest(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Storing large files in multiple schemas: BLOB or BYTEA
Date: 2012-10-11 05:56:08
Message-ID: 50765F78.7040202@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 10/11/2012 01:35 PM, tigran2-postgres(at)riatest(dot)com wrote:
> Using files stored outside the database creates all sorts of problems.
> For starters you lose ACID guaranties. I would prefer to keep them in
> database. We did a lot of experiments with Large Objects and they really
> worked fine (stored hundreds of LOBs ranging from a few MB up to 1GB).
> Postgres does a really good job with Large Objects. If it was not the
> pg_dump problem I would not hesitate to use LOBs.

Yeah, a pg_dump mode that dumped everything but large objects would be
nice.

Right now I find storing large objects in the DB such a pain from a
backup management point of view that I avoid it where possible.

I'm now wondering about the idea of implementing a pg_dump option that
dumped large objects into a directory tree like
lobs/[loid]/[lob_md5]
and wrote out a restore script that loaded them using `lo_import`.

During dumping temporary copies could be written to something like
lobs/[loid]/.tmp.nnnn with the md5 being calculated on the fly as the
byte stream is read. If the dumped file had the same md5 as the existing
one it'd just delete the tempfile; otherwise the tempfile would be
renamed to the calculated md5.

That way incremental backup systems could manage the dumped LOB tree
without quite the same horrible degree of duplication as is currently
faced when using lo in the database with pg_dump.

A last_modified timestamp on `pg_largeobject_metadata` would be even
better, allowing the cost of reading and discarding rarely-changed large
objects to be avoided.

--
Craig Ringer

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vineet Deodhar 2012-10-11 06:07:31 Re: moving from MySQL to pgsql
Previous Message tigran2-postgres 2012-10-11 05:35:54 Re: Storing large files in multiple schemas: BLOB or BYTEA