Re: Best Strategy for Large Number of Images

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Estevan Rech <softrech(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Rob Sargent <robjsargent(at)gmail(dot)com>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Best Strategy for Large Number of Images
Date: 2021-12-15 22:51:48
Message-ID: CAKFQuwaM5aq2x5DMv5Hw7f1Dvwtw3YBpRXEGt-HGwoNkiCc_Dg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Dec 15, 2021 at 1:12 PM Estevan Rech <softrech(at)gmail(dot)com> wrote:

>
> Related to this is the backup service that seems to me that in a disaster,
> I can restore the database relatively quickly if it's in the database. On
> the disk, I believe that the restoration is much slower and I don't think
> it's reliable that all the images are copied/restored.
>

Everything ends up "on disk" at some point - whether it's a database or a
simple file server - so speed and reliability would generally be a wash
conceptually.

Honestly, with that more complete description (and the assumption these
images are primary work products that are being paid for), I would say you
should be doing both. Use something like AWS S3 to store a copy - saving
its path into the DB as metadata - while also leaving a copy within the
database. As mentioned nearby, also generate a hash and store that.
Institute a rule that once an image has been generated it is never
altered. If you need to deal with editing, create a new record, and maybe
record the "source image hash" so you can build a chain if desired.

I'd probably store the image data on a separate table than the metadata -
in a one-to-one relationship. Maybe even partition the image file table.
That lets you drop images from the DB if desired - with the option to
restore them from the AWS S3 system if needed. You can over-engineer
things this way but I do think that having two primary sources for the
images, plus their backups, is reasonable for the types of images you are
dealing with.

David J.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Matt Magoffin 2021-12-15 22:54:33 Re: Properly handling aggregate in nested function call
Previous Message David G. Johnston 2021-12-15 22:31:50 Re: Why can't I have a "language sql" anonymous block?