Re: How to store "blobs" efficiently for small and large sizes, with random access

From: Dominique Devienne <ddevienne(at)gmail(dot)com>
To: Andreas Joseph Krogh <andreas(at)visena(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: How to store "blobs" efficiently for small and large sizes, with random access
Date: 2022-10-19 10:48:24
Message-ID: CAFCRh-976riSPX=MVcN5a5KgGa1JUEmx3b3NUbGP2CCjtjyVag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh <andreas(at)visena(dot)com>
wrote:

> First advice, don't do it. We started off storing blobs in DB for “TX
> safety”
>

Not really an option, I'm afraid.

> , but backup/restore quickly became too cumbersome so we ended up moving
> all blobs out and only store reference in DB.
>
This required us to make a “vacuum system” that cleans up the blob-storage
> regularly as ROLLBACK/crash can make it out of sync.
>

Note the fact the data is spread in many mostly independent schemas.
Might ease the backup/restore? I'm not much of a DBA though...

> We chose storing as LO because with it, streaming large blobs (not using
> much memory) actually worked, with JDBC at least.
>

I'm in C++, with I believe efficient use of binary binds and results, and
use of COPY as much as possible,
so as good as it gets I think (that I know of, with libpq), in terms of
performance. Haven't looked at libpq's
new in v14 pipeline mode yet though.

Investigated Cursor vs Statement too, and it's a tradeoff between latency
and throughput.
Not sure how good JDBC drivers for PostgreSQL are. I'm doing my best to go
fast with libpq.

In any case, thanks for your input. But it's not really a question of "if".
But of "how".

Putting thousands of large blobs in the file system is a no go. Assuming
the clients
can even see the file system the server sees. This is a 2-tier system,
there's no mid-tier
that would somehow magically handle proper security and lifetime management
of these blobs.

Thanks, --DD

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andreas Joseph Krogh 2022-10-19 11:00:42 Re: How to store "blobs" efficiently for small and large sizes, with random access
Previous Message Andreas Joseph Krogh 2022-10-19 10:17:10 Sv: How to store "blobs" efficiently for small and large sizes, with random access