Re: Index on (fixed size) bytea value

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Les <nagylzs(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Index on (fixed size) bytea value
Date: 2023-06-20 06:50:58
Message-ID: 335bca9eaf98a6865c8bfd8fc54b09caa055c7bf.camel@cybertec.at
Lists: pgsql-performance

On Tue, 2023-06-20 at 08:13 +0200, Les wrote:
> I'm aware of TOAST and how it works. I was referring to it when I wrote "I think that it
> should be as large as possible, without hitting the toast." I have designed a separate
> "block" table specifically to avoid storing binary data in TOAST, so my plan does not
> involve out-of-line storage.
>
> Just to make this very clear: a record in the block table would store a block, not the
> whole file. My question is about finding the optimal block size (without hitting TOAST)
> and the optimal hash algorithm for block de-duplication.

Then you would ALTER the column and SET STORAGE MAIN, so that the value is moved out of line
into the TOAST table only as a last resort, when the row would otherwise not fit on a page.

The size limit for a row would then be 8kB minus page header minus row header, which
should be somewhere in the vicinity of 8140 bytes.
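
As a rough sketch of that arithmetic: the constants below are assumptions for a default
build (8kB block size, 8-byte alignment, a 24-byte page header, a 4-byte line pointer and
a 23-byte row header without a null bitmap), and the helper names are made up for
illustration.

```python
# Back-of-the-envelope estimate of how much column data fits in one heap row.
# All sizes are assumptions for a default 64-bit build, not values read from a server.
BLOCK_SIZE = 8192    # default page size
PAGE_HEADER = 24     # fixed part of the page header
LINE_POINTER = 4     # one item pointer per row
ROW_HEADER = 23      # heap row header without a null bitmap
ALIGNMENT = 8        # typical maximum alignment


def maxalign(n: int) -> int:
    """Round n up to the next multiple of ALIGNMENT."""
    return (n + ALIGNMENT - 1) & ~(ALIGNMENT - 1)


def max_row_data(block_size: int = BLOCK_SIZE) -> int:
    """Bytes left for column data when a single row fills the page."""
    max_row = block_size - maxalign(PAGE_HEADER + LINE_POINTER)
    return max_row - maxalign(ROW_HEADER)


print(max_row_data())  # 8136
```

With these assumptions the result is 8136 bytes. The bytea value carries its own length
word (assumed to be 4 bytes here), and any other column in the row, such as a hash, has to
fit in the same budget.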

If you want your block size to be a power of two, the limit would be 4kB, which would waste
almost half your storage space.
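
To illustrate the waste with power-of-two block sizes, here is a simplified estimate. It
assumes a table with nothing but the bytea column, a 4-byte length word per value, and the
same default page layout sizes as above; the function names are hypothetical.

```python
# Simplified estimate of rows per 8kB page for a given block payload size.
# Assumes the row holds only the bytea value; real tables will fit slightly less.
PAGE_SIZE = 8192
PAGE_HEADER = 24
LINE_POINTER = 4
ROW_HEADER = 24      # 23-byte row header padded to 8-byte alignment
LENGTH_WORD = 4      # assumed length word in front of the bytea data
ALIGNMENT = 8


def row_footprint(payload: int) -> int:
    """Space one row takes on the page, including its line pointer."""
    row = ROW_HEADER + LENGTH_WORD + payload
    aligned = (row + ALIGNMENT - 1) & ~(ALIGNMENT - 1)
    return aligned + LINE_POINTER


def rows_per_page(payload: int) -> int:
    return (PAGE_SIZE - PAGE_HEADER) // row_footprint(payload)


def wasted_fraction(payload: int) -> float:
    """Share of the page that does not hold block payload."""
    return 1 - rows_per_page(payload) * payload / PAGE_SIZE


for size in (1024, 2048, 4096):
    print(size, rows_per_page(size), wasted_fraction(size))
```

Under these assumptions a 4096-byte block leaves room for only one row per page, so half
of each page holds no payload; smaller powers of two waste less, at the cost of more rows.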

Yours,
Laurenz Albe
