From: Chris Travers <chris(dot)travers(at)gmail(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: tigran2-postgres(at)riatest(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: Storing large files in multiple schemas: BLOB or BYTEA
Date: 2012-10-11 09:34:15
Message-ID: CAKt_Zft9Aqwc2pWoFr1BBWyt8GqugYX9u=ERhqD0YOd7icGzHw@mail.gmail.com
Lists: pgsql-general
On Wed, Oct 10, 2012 at 10:56 PM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:
> On 10/11/2012 01:35 PM, tigran2-postgres(at)riatest(dot)com wrote:
>
>> Using files stored outside the database creates all sorts of problems.
>> For starters you lose ACID guarantees. I would prefer to keep them in
>> the database. We did a lot of experiments with Large Objects and they really
>> worked fine (stored hundreds of LOBs ranging from a few MB up to 1GB).
>> Postgres does a really good job with Large Objects. If it were not for the
>> pg_dump problem I would not hesitate to use LOBs.
>>
>
> Yeah, a pg_dump mode that dumped everything but large objects would be
> nice.
>
I remember when that was the only way pg_dump worked, and it caused plenty
of problems.
But yeah, --only-lobs and --no-lobs might be nice switches.
>
> Right now I find storing large objects in the DB such a pain from a backup
> management point of view that I avoid it where possible.
>
>
> I'm now wondering about the idea of implementing a pg_dump option that
> dumped large objects into a directory tree like
> lobs/[loid]/[lob_md5]
> and wrote out a restore script that loaded them using `lo_import`.
>
I'm thinking of the problems that occurred when we used to require lobs to
be backed up to binary archive formats....
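A restore script along those lines might end up as little more than a list of
lo_import calls, one per dumped file. A rough sketch (Python, hypothetical
names; this assumes the proposed lobs/[loid]/[lob_md5] layout and the
server-side lo_import(path, oid) function, which requires superuser and reads
from the server's filesystem, not the client's):

```python
import os

def write_restore_script(lob_root, out_path):
    """Emit SQL that re-imports each dumped LOB under its original OID.

    lob_root is assumed to follow the proposed lobs/[loid]/[lob_md5]
    layout: one directory per OID, containing one file named by md5.
    """
    with open(out_path, "w") as out:
        # Each subdirectory name is a large object OID.
        for loid in sorted(os.listdir(lob_root), key=int):
            lob_dir = os.path.join(lob_root, loid)
            for md5name in os.listdir(lob_dir):
                path = os.path.join(lob_dir, md5name)
                # Server-side lo_import(path, oid) preserves the OID,
                # so references from user tables keep working.
                out.write("SELECT lo_import('%s', %s);\n" % (path, loid))
```

(psql's client-side \lo_import could be used instead, but it assigns a new
OID, so the OID-preserving server-side form seems the better fit here.)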
> During dumping temporary copies could be written to something like
> lobs/[loid]/.tmp.nnnn with the md5 being calculated on the fly as the byte
> stream is read. If the dumped file had the same md5 as the existing one
> it'd just delete the tempfile; otherwise the tempfile would be renamed to
> the calculated md5.
>
>
> That way incremental backup systems could manage the dumped LOB tree
> without quite the same horrible degree of duplication as is currently faced
> when using lo in the database with pg_dump.
>
How do incremental backup systems work with lots of data anyway when fed by
pg_dump? I would think that's not the approach I would take to incremental
backups and PostgreSQL....
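Still, the write-to-tempfile-then-rename scheme you describe is simple enough.
A minimal sketch of it (Python, hypothetical names; the .tmp naming and
directory layout are assumptions from your description):

```python
import hashlib
import os

def dump_lob(lob_dir, byte_stream, chunk_size=8192):
    """Dump one LOB to lob_dir/[md5], hashing on the fly as bytes are read.

    If a file with the same md5 already exists from a previous dump, the
    tempfile is deleted instead of replacing it, so the existing file's
    mtime/inode are untouched and incremental backup tools skip it.
    """
    os.makedirs(lob_dir, exist_ok=True)
    tmp_path = os.path.join(lob_dir, ".tmp.%d" % os.getpid())
    digest = hashlib.md5()
    with open(tmp_path, "wb") as tmp:
        for chunk in iter(lambda: byte_stream.read(chunk_size), b""):
            digest.update(chunk)  # md5 computed as the stream is copied
            tmp.write(chunk)
    final_path = os.path.join(lob_dir, digest.hexdigest())
    if os.path.exists(final_path):
        os.unlink(tmp_path)       # unchanged since the last dump
    else:
        os.rename(tmp_path, final_path)
    return final_path
```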
>
> A last_modified timestamp on `pg_largeobject_metadata` would be even
> better, allowing the cost of reading and discarding rarely-changed large
> objects to be avoided.
>
It might be interesting to look at the issue of large objects from a total
backup perspective. I do wonder, though, where the end would be. You
could have 500MB text fields, and those might pose backup issues as well. I
suppose better LOB support in backups would still give additional options
for management.
Best Wishes,
Chris Travers