From: | Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com> |
Subject: | Re: refactoring basebackup.c |
Date: | 2021-10-20 12:48:55 |
Message-ID: | CAOgcT0OBD+GU4ZtY3wzzQng8ffjTP_WhXGnfUzrSN4vxDnzxmw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Robert,
I mean #define CHUNK_SIZE is still in the patch.
>
Oops, removed now.
> > > With just that change, you can set
> > > next_buf_len = LZ4F_HEADER_SIZE_MAX + mysink->output_buffer_bound --
> > > but that's also more than you need. You can instead do next_buf_len =
> > > Min(LZ4F_HEADER_SIZE_MAX, mysink->output_buffer_bound). Now, you're
> > > probably thinking that won't work, because bbsink_lz4_begin_archive()
> > > could fill up the buffer partway, and then the first call to
> > > bbsink_lz4_archive_contents() could overrun it. But that problem can
> > > be solved by reversing the order of operations in
> > > bbsink_lz4_archive_contents(): before you call LZ4F_compressUpdate(),
> > > test whether you need to empty the buffer first, and if so, do it.
> >
> > I am still not able to get - how can we survive with a mere
> > size of Min(LZ4F_HEADER_SIZE_MAX, mysink->output_buffer_bound).
> > LZ4F_HEADER_SIZE_MAX is defined as 19 in lz4 library. With this
> > proposal, it is almost guaranteed that the next buffer length will
> > be always set to 19, which will result in failure of a call to
> > LZ4F_compressUpdate() with the error LZ4F_ERROR_dstMaxSize_tooSmall,
> > even if we had called bbsink_archive_contents() before.
>
> Sorry, should have been Max(), not Min().
>
Ahh ok.
I looked into the code of LZ4F_compressBound() and the header size is
already included in the calculations, so when we compare
LZ4F_HEADER_SIZE_MAX and mysink->output_buffer_bound, the latter
will be always greater, and hence sufficient length for the output buffer. I
made this change in the attached patch, and also cleared the buffer by
calling bbsink_archive_contents() before calling LZ4_compressUpdate().
Similarly cleared the buffer before calling LZ4F_compressEnd().
> You mean the way gzip allows us to use our own alloc and free functions
> > by means of providing the function pointers for them. Unfortunately,
> > no, LZ4 does not have that kind of provision. Maybe that makes a
> > good proposal for LZ4 library ;-).
> > I cannot think of another solution to it right away.
>
> OK. Will give it some thought.
I have started a thread[1] on LZ4 community for this, but so far no
reply on that.
Regards,
Jeevan Ladhe
[1]
https://groups.google.com/g/lz4c/c/WnJkKwBWlcM/m/zszrla2mBQAJ?utm_medium=email&utm_source=footer
Attachment | Content-Type | Size |
---|---|---|
lz4_compress_v6.patch | application/octet-stream | 10.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Ronan Dunklau | 2021-10-20 12:58:26 | Re: pg_receivewal starting position |
Previous Message | Andrew Dunstan | 2021-10-20 12:40:04 | Re: Postgres perl module namespace |