Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ

From: Sasasu <i(at)sasa(dot)su>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, sfrost(at)snowman(dot)net
Subject: Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ
Date: 2021-11-30 05:55:29
Message-ID: cc3110c5-b15c-3fa5-ad83-9dd7193d10a8@sasa.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021/11/29 18:05, Antonin Houska wrote:
> Does this test really pass regression tests? In BufFileRead(), I would
> understand if you did
>
> + file->pos = offsetInBlock;
> + file->curOffset -= offsetInBlock;
>
> rather than
>
> + file->pos += offsetInBlock;
> + file->curOffset -= offsetInBlock;

It pass all regression tests. this patch is compatible with
BufFileSeek().

to generate a correct alignment, we need to make sure
pos_new + offset_new = pos_old + offset_old
offset_new = offset_old - offset_old % BLCKSZ
it means
pos_new = pos_old + offset_old % BLCKSZ
= pos_old + "offsetInBlock"

with your code, backend will read a wrong buffile at the end of buffile
reading. for example: physical file size = 20 and pos = 10, off = 10,
read start at 20. after the '=' code: pos = 10, off = 0, read start at
10, which is wrong.

> Anyway, BufFileDumpBuffer() does not seem to enforce curOffset to end up at
> block boundary, not to mention BufFileSeek().
>
> When I was implementing this for our fork [1], I concluded that the encryption
> code path is too specific, so I left the existing code for the unecrypted data
> and added separate functions for the encrypted data.
>
> One specific thing is that if you encrypt and write n bytes, but only need
> part of it later, you need to read and decrypt exactly those n bytes anyway,
> otherwise the decryption won't work. So I decided not only to keep curOffset
> at BLCKSZ boundary, but also to read / write BLCKSZ bytes at a time. This also
> makes sense if the scope of the initialization vector (IV) is BLCKSZ bytes.
>
> Another problem is that you might want to store the IV somewhere in between
> the data. In short, the encryption makes the buffered IO rather different and
> the specific code should be kept aside, although the same API is used to
> invoke it.
>
but I want to make less change on existed code. with this path. the only
code added to critical code path is this:

diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index 3be08eb723..ceae85584b 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -512,6 +512,9 @@ BufFileDumpBuffer(BufFile *file)
/* and the buffer is aligned with BLCKSZ */
Assert(file->curOffset % BLCKSZ == 0);

+ /* encrypt before write */
+ TBD_ENC(file->buffer.data + wpos /* buffer */, bytestowrite /* size */, file->curOffset /* context to find IV */);
+
thisfile = file->files[file->curFile];
bytestowrite = FileWrite(thisfile,
file->buffer.data + wpos,
@@ -582,6 +585,9 @@ BufFileRead(BufFile *file, void *ptr, size_t size)
BufFileLoadBuffer(file);
if (file->nbytes <= 0 || (file->nbytes == file->pos && file->nbytes != BLCKSZ))
break; /* no more data available */
+
+ /* decrypt after read */
+ TBD_DEC(file->buffer /* buffer */, file->nbytes /* size */, file->curOffset /* context to find IV */);
}

nthistime = file->nbytes - file->pos;

those change will allow TDE to use any encryption algorithm (read offset
and write offset are matched) and implement on-the-fly IV generation.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-11-30 06:00:28 Re: parallel vacuum comments
Previous Message houzj.fnst@fujitsu.com 2021-11-30 05:33:09 RE: parallel vacuum comments