From: | Andrew Chernow <pg-job(at)esilo(dot)com> |
---|---|
To: | Jorge Godoy <jgodoy(at)gmail(dot)com> |
Cc: | John McCawley <nospam(at)hardgeus(dot)com>, Clodoaldo <clodoaldo(dot)pinto(dot)neto(at)gmail(dot)com>, imageguy <imageguy1206(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Database versus filesystem for storing images |
Date: | 2007-01-05 21:42:33 |
Message-ID: | 459EC649.7020301@esilo.com |
Lists: | pgsql-general |
> And how do you guarantee that after a failure? You're restoring two
> different sets of data here:
> How do you link them together on that specific operation? Or even on a daily
> basis, if you get corrupted data...
I answered that already.
> Not counting that depending on your choice of filesystem and image size you
> might get a very poor performance.
Apache has very good page and image caching. You could take advantage of that
using this technique.
Another nice feature is that the database and images can be handled separately. Some
people on this thread have seen this as a disadvantage; I personally don't see
it that way.
I guess it depends on access needs, how many files and how much data you have. What
if you had 3 billion files across a few hundred terabytes? Can you say from
experience how the database would hold up in this situation?
andrew
Jorge Godoy wrote:
> Andrew Chernow <pg-job(at)esilo(dot)com> writes:
>
>>> I mean, how do you handle integrity with data
>>> outside the database?
>> You don't; the file system handles integrity of the stored data. Although,
>> one must be careful to avoid db and fs orphans. Meaning, a record with no
>> corresponding file or a file with no corresponding record. Always
>> write()/insert an image file to the system within a transaction, including
>> writing the image out to the fs. Make sure to unlink any partially written
>> image files.
>
> And how do you guarantee that after a failure? You're restoring two
> different sets of data here:
>
> - backup from your database
> - backup from your files
>
> How do you link them together on that specific operation? Or even on a daily
> basis, if you get corrupted data...
>
>>>> How do you plan your backup routine
>> With regard to backup, back up the files one by one. Grab the latest image
>> file refs from the database and start backing up those images. Each
>> successfully backed up image should be followed by inserting that file's
>> database record into a remote db server. If anything fails, clean up the
>> partial image file (to avoid orphaned data) and roll back the transaction.
>>
>> just one idea. i'm sure there are other ways of doing it. point is, this is
>> completely possible to do reliably.
>
> Wouldn't replication with, e.g., Slony be easier? And wouldn't letting the
> database handle all the integrity be easier? I mean, create an "images" table
> and then make your record depend on this table, so if there's no record with
> the image, you won't have any references to it left.
>
> It would also make the backup plan easier: backup the database.
>
> Not counting that depending on your choice of filesystem and image size you
> might get a very poor performance.
>
>
>
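For what it's worth, the "write the file inside the transaction and unlink partial files" idea quoted above could be sketched roughly like this. It is only an illustration under assumptions, not code from this thread: it uses Python with the stdlib sqlite3 module standing in for PostgreSQL so that it runs as-is, and the images table, store_image() and find_orphans() are made-up names.

```python
import os
import sqlite3
import tempfile


def store_image(conn, image_dir, name, data):
    """Insert the db record and write the image file as one unit of work.

    Assumes a hypothetical table: images(name TEXT PRIMARY KEY, path TEXT).
    """
    path = os.path.join(image_dir, name)
    created = False  # only unlink a file that this call created
    try:
        # "with conn" commits on success and rolls back on any exception.
        with conn:
            conn.execute(
                "INSERT INTO images (name, path) VALUES (?, ?)", (name, path)
            )
            # Mode "xb" refuses to overwrite a file that already exists.
            with open(path, "xb") as f:
                created = True
                f.write(data)
                f.flush()
                os.fsync(f.fileno())  # push the bytes to disk before commit
    except BaseException:
        # The insert, the write, or the commit failed: remove the partial
        # file so that no fs orphan is left behind.
        if created and os.path.exists(path):
            os.unlink(path)
        raise
    return path


def find_orphans(conn, image_dir):
    """Return (records with no file, files with no record)."""
    recorded = {row[0] for row in conn.execute("SELECT path FROM images")}
    on_disk = {os.path.join(image_dir, n) for n in os.listdir(image_dir)}
    return sorted(recorded - on_disk), sorted(on_disk - recorded)


if __name__ == "__main__":
    image_dir = tempfile.mkdtemp()
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images (name TEXT PRIMARY KEY, path TEXT NOT NULL)")
    store_image(conn, image_dir, "example.jpg", b"not really a jpeg")
    print(find_orphans(conn, image_dir))
```

A crash between the file write and the commit can still leave a file with no record, so something like find_orphans() would have to run periodically (or after restoring the two backups) to catch mismatches in either direction.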