Re: Add LZ4 compression in pg_dump

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: gkokolatos(at)pm(dot)me
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-01-25 01:42:03
Message-ID: 20230125014203.GL13860@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 24, 2023 at 03:56:20PM +0000, gkokolatos(at)pm(dot)me wrote:
> On Monday, January 23rd, 2023 at 7:00 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos(at)pm(dot)me wrote:
> >
> > > Please find attached v23 which reintroduces the split.
> > >
> > > 0001 is reworked to have a reduced footprint than before. Also in an attempt
> > > to facilitate the readability, 0002 splits the API's and the uncompressed
> > > implementation in separate files.
> >
> > Thanks for updating the patch. Could you address the review comments I
> > sent here ?
> > https://www.postgresql.org/message-id/20230108194524.GA27637%40telsasoft.com
>
> Please find v24 attached.

Thanks for updating the patch.

In 001, RestoreArchive() does:

> -#ifndef HAVE_LIBZ
> - if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
> - AH->PrintTocDataPtr != NULL)
> + supports_compression = false;
> + if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
> + AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
> + supports_compression = true;
> +
> + if (AH->PrintTocDataPtr != NULL)
> {
> for (te = AH->toc->next; te != AH->toc; te = te->next)
> {
> if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
> - pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
> + {
> +#ifndef HAVE_LIBZ
> + if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
> + supports_compression = false;
> +#endif
> + if (supports_compression == false)
> + pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
> + }
> }
> }
> -#endif

This first checks if the algorithm is implemented, and then checks if
the algorithm is supported by the current build - that confused me for a
bit. It seems unnecessary to check for unimplemented algorithms before
looping. That also requires referencing both GZIP and LZ4 in two
places.

I think it could be written to avoid the need to change for added
compression algorithms:

+ if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
+ {
+ /* Check if the compression algorithm is supported */
+ pg_compress_specification spec;
+ parse_compress_specification(AH->compression_spec.algorithm, NULL, &spec);
+ if (spec->parse_error != NULL)
+ pg_fatal(spec->parse_error);
+ }

Or maybe add a new function to compression.c to indicate whether a given
algorithm is supported.

That would also indicate *which* compression library isn't supported.

Other than that, I think 001 is ready.

002/003 use these names, which I think are too similar - initially I
didn't even realize there were two separate functions (each with a
second stub function to handle the case of unsupported compression):

+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);

+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);

typo:
s/not build with/not built with/

Should AllocateCompressor() set cs->compression_spec, rather than doing
it in each compressor ?

Thanks for considering.

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2023-01-25 01:42:05 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Justin Pryzby 2023-01-25 01:37:29 Re: Improve logging when using Huge Pages