From: | David Wall <d(dot)wall(at)computer(dot)org> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Storing many big files in database- should I do it? |
Date: | 2010-04-29 16:07:52 |
Message-ID: | 4BD9AED8.5060502@computer.org |
Lists: | pgsql-general |
Things to consider when /not/ storing them in the DB:
1) Backups of DB are incomplete without a corresponding backup of the files.
2) No transactional integrity between filesystem and DB, so you will
have to deal with orphans from both INSERT and DELETE (assuming you
don't also update the files).
3) No built-in ability for replication, such as WAL shipping.
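
To make point 2 concrete, here is a rough sketch of the kind of reconciliation pass you end up writing when the files live outside the DB. It assumes the database stores each file's path relative to some storage root; the function name `find_orphans` and the example paths are made up for illustration, and the set of paths would in practice come from a query against your own table.

```python
import os
import tempfile


def find_orphans(db_paths, root):
    """Compare paths recorded in the DB with files actually under root.

    Returns (files_without_rows, rows_without_files), both as sets of
    '/'-separated paths relative to root.
    """
    on_disk = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Normalise to '/' so the comparison does not depend on the OS.
            on_disk.add(os.path.relpath(full, root).replace(os.sep, "/"))
    known = set(db_paths)
    return on_disk - known, known - on_disk


if __name__ == "__main__":
    # Hypothetical demo: one file with no DB row, one DB row with no file.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "41", "56"))
        open(os.path.join(root, "41", "56", "a.ext"), "w").close()
        extra, missing = find_orphans({"41/56/b.ext"}, root)
        print("files without rows:", sorted(extra))
        print("rows without files:", sorted(missing))
```

A pass like this only detects orphans after the fact; it does not give you the transactional guarantee you would get by keeping the data in the DB.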
The big downside of keeping them in the DB is that all large objects
appear to be stored together in pg_catalog.pg_largeobject. That seems
troubling on its face: you know you have lots of big data, yet you store
it all in one place, and then have to worry about running out of 'loids'.
David
On 4/29/2010 2:10 AM, Cédric Villemain wrote:
> 2010/4/28 Adrian Klaver<adrian(dot)klaver(at)gmail(dot)com>:
>
>> On Tuesday 27 April 2010 5:45:43 pm Anthony wrote:
>>
>>> On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain
>>> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>>>
>>>> store your files in a filesystem, and keep the path to the file (plus
>>>> metadata, acl, etc...) in database.
>>>>
>>> What type of filesystem is good for this? A filesystem with support for
>>> storing tens of thousands of files in a single directory, or should one
>>> play the 41/56/34/41563489.ext game?
>>>
> I'd go with XFS or ext{3-4}, in both cases with a path game.
> Your path game will let you handle the scalability of your uploads (so
> the first increment is the first directory), something like
> 1/2/3/4/foo.file, 2/2/3/4/bar.file, etc. You might explore a hash
> function, or something that splits a SHA1 (or other) sum of the file,
> to get the path.
>
>
>
>>> Are there any open source systems which handle keeping a filesystem and
>>> database in sync for this purpose, or is it a wheel that keeps getting
>>> reinvented?
>>>
>>> I know "store your files in a filesystem" is the best long-term solution.
>>> But it's just so much easier to just throw everything in the database.
>>>
>> In the for-what-it-is-worth department, check out this wiki:
>> http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems
>>
> and postgres fuse also :-D
>
>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)gmail(dot)com
>>
>>
>
>
>