Re: PDF files: to store in database or not

From: Chris Travers <chris(dot)travers(at)gmail(dot)com>
To: Rich Shepard <rshepard(at)appl-ecosys(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PDF files: to store in database or not
Date: 2016-12-08 15:25:20
Message-ID: CAKt_Zfv8mDPYCMoZSsnsdx_QFY43LSveuX8OhfF-gtCUhZc-hw@mail.gmail.com
Lists: pgsql-general

On Thu, Dec 8, 2016 at 7:16 AM, Rich Shepard <rshepard(at)appl-ecosys(dot)com>
wrote:

> On Thu, 8 Dec 2016, John DeSoi wrote:
>
>> I have been storing PDFs in Postgres for several years without any
>> problems. Documents range in size from a few pages to 100+ pages. I'm
>> using a bytea column, not large objects. I store the documents in a
>> separate database from the rest of the application data in order to make
>> it easy to exclude in database dumps or backup in some other way. I'm
>> currently managing about 600,000 documents.
>>
>
> John,
>
> This is really good information. Rather than using a separate database I
> think that storing all PDFs in a separate table makes sense for my
> application. Backup practices will be the domain of those using the
> application (which I've decided to open-source and give away because I'm not
> in the software business). A simple join to the appropriate data table will
> make them available.
>
> Not having used the bytea data type before I'll read how to work with it.
>

Assuming relatively small files, bytea makes much more sense than a large
object. However, note that encoding and decoding can be relatively memory
intensive, depending on your environment. This is not a problem with small
files; I would typically start to worry when you get into the hundreds of MB
in size. At least in Perl, I expect decoding to take about 8x the size of
the final file in RAM.
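
To illustrate one source of that overhead, here is a minimal sketch, assuming
the value travels in PostgreSQL's text-mode hex format for bytea (`\x`
followed by two hex digits per byte). The helper names and the sample payload
are hypothetical. It does not reproduce the 8x Perl figure; it only shows that
the encoded copy is about twice the file size and sits in memory alongside
the decoded copy.

```python
import binascii


def bytea_hex_encode(data: bytes) -> str:
    # Hex text form of a bytea value: a literal backslash-x prefix,
    # then two hex digits for every byte.
    return "\\x" + binascii.hexlify(data).decode("ascii")


def bytea_hex_decode(text: str) -> bytes:
    # Reverse of the above; rejects anything without the \x prefix.
    if not text.startswith("\\x"):
        raise ValueError("not in bytea hex format")
    return binascii.unhexlify(text[2:])


# Hypothetical stand-in for a small PDF file's contents.
pdf_bytes = b"%PDF-1.4\n" + bytes(range(256)) * 4

wire_text = bytea_hex_encode(pdf_bytes)   # encoded copy held in memory
round_trip = bytea_hex_decode(wire_text)  # decoded copy held in memory

print("raw bytes:    ", len(pdf_bytes))
print("encoded chars:", len(wire_text))
print("ratio:        ", len(wire_text) / len(pdf_bytes))
```

Drivers that transfer bytea in binary mode avoid the hex step, so the real
overhead depends on the client library and how it is configured.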

LOBs work best when you need a streaming interface (seek and friends), while
bytea values are otherwise much more pleasant to work with.
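
As a rough sketch of what "seek and friends" buys you: the function below
reads a byte range in fixed-size chunks from any object that offers seek()
and read(), so only one chunk is in memory at a time. The function name is
hypothetical, and io.BytesIO is used here purely as a stand-in for a large
object handle; how you obtain a real handle depends on your driver.

```python
import io


def read_range(stream, offset, length, chunk_size=8192):
    # Jump to the requested offset, then yield at most `length` bytes
    # in pieces no larger than `chunk_size`.
    stream.seek(offset)
    remaining = length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # reached the end of the object early
        remaining -= len(chunk)
        yield chunk


# Stand-in for a large object: 256,000 bytes of sample data.
sample = bytes(range(256)) * 1000
fake_lob = io.BytesIO(sample)

pieces = list(read_range(fake_lob, offset=1000, length=20000))
print("chunks read:", len(pieces))
print("bytes read: ", sum(len(p) for p in pieces))
```

With a bytea column the whole value normally comes back in one piece, which
is simpler to handle but gives you no equivalent of the seek above.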

>
> Thanks very much for your insights,
>
> Rich
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more
