| From: | John R Pierce <pierce(at)hogranch(dot)com> | 
|---|---|
| To: | pgsql-general(at)postgresql(dot)org | 
| Subject: | Re: Storing many big files in database- should I do it? | 
| Date: | 2010-04-27 08:54:15 | 
| Message-ID: | 4BD6A637.8030902@hogranch.com | 
| Lists: | pgsql-general | 
Rod wrote:
> Hello,
>
> I have a web application where users upload/share files.
> After a file is uploaded, it is copied to S3, and all subsequent downloads
> are served from there.
> So in a file's lifetime it is accessed only twice: when created and
> when copied to S3.
>
> Files are documents of varying size, from a few kilobytes to 200
> megabytes. Number of files: thousands to hundreds of thousands.
>
> My dilemma is: should I store the files in the PGSQL database, or store
> them in the filesystem and keep only metadata in the database?
>
> I see these possible cons of using PGSQL as storage:
> - more network bandwidth required compared to accessing an NFS-mounted filesystem?
> - if the database becomes corrupt, you can't recover individual files
> - you can't back up a live database unless you install complicated
> replication add-ons
> - more CPU required to store/retrieve files (compared to filesystem access)
> - size overhead, e.g. storing 1000 bytes will take 1000 bytes in the
> database + 100 bytes for db metadata, index, etc. With a lot of files
> this will be a lot of overhead.
>
> Are these concerns valid?
> Has anyone had this kind of design problem, and how did you solve it?
>   
S3 storage is not suitable for running an RDBMS.
An RDBMS wants fast, low-latency storage using 8k block random reads and 
writes.  S3 is high latency and oriented towards streaming
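
For illustration only, here is a minimal sketch of the second option raised in the question: file bytes go to the filesystem and only metadata goes into the database. Nothing in it comes from the thread. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the `uploads` table, the `open_metadata_db` and `store_upload` functions, and the content-addressed file naming are all hypothetical choices.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_metadata_db(db_path=":memory:"):
    """Open the metadata database (sqlite3 here, standing in for PostgreSQL)."""
    conn = sqlite3.connect(db_path)
    # Hypothetical schema: one row per uploaded file, no file contents.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " path TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL,"
        " sha256 TEXT NOT NULL)"
    )
    return conn


def store_upload(conn, storage_dir, name, data):
    """Write the bytes to disk and record only metadata in the database."""
    digest = hashlib.sha256(data).hexdigest()
    # Assumption: name the file after its hash so uploads never collide.
    path = Path(storage_dir) / digest
    path.write_bytes(data)
    conn.execute(
        "INSERT INTO uploads (name, path, size_bytes, sha256) VALUES (?, ?, ?, ?)",
        (name, str(path), len(data), digest),
    )
    conn.commit()
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as storage_dir:
        conn = open_metadata_db()
        saved = store_upload(conn, storage_dir, "report.pdf", b"example bytes")
        print(saved.name, conn.execute("SELECT size_bytes FROM uploads").fetchone())
        conn.close()
```

With this layout the database rows stay small regardless of file size, and the stored checksum gives a way to verify a file before it is copied onward. The trade-off is that the file and its row are not written atomically, so a real system would need to handle a crash between the two steps.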