From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | gkokolatos(at)pm(dot)me |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com> |
Subject: | Re: Add LZ4 compression in pg_dump |
Date: | 2023-01-14 21:43:09 |
Message-ID: | 20230114214308.GC9837@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Jan 08, 2023 at 01:45:25PM -0600, Justin Pryzby wrote:
> On Thu, Dec 22, 2022 at 11:08:59AM -0600, Justin Pryzby wrote:
> > There's a couple of lz4 bits which shouldn't be present in 002: file
> > extension and comments.
> BTW I noticed that cfdopen() was accidentally committed to compress_io.h
> in master without being defined anywhere.
This was resolved in 69fb29d1a (so now needs to be re-added for this
patch series).
> pg_compress_specification is being passed by value, but I think it
> should be passed as a pointer, as is done everywhere else.
ISTM that was an issue with 5e73a6048, affecting a few public and
private functions. I wrote a pre-preparatory patch which changes to
pass by reference.
And addressed a handful of other issues I reported as separate fixup
commits. And changed to use LZ4 by default for CI.
I also rebased my 2 year old patch to support zstd in pg_dump. I hope
it can finally added for v16. I'll send it for the next CF if these
patches progress.
One more thing: some comments still refer to the cfopen API, which this
patch removes.
> There were "LZ4" comments and file extension stuff in the preparatory
> commit. But now it seems like you *removed* them in the LZ4 commit
> (where it actually belongs) rather than *moving* it from the
> prior/parent commit *to* the lz4 commit. I recommend to run something
> like "git diff @{1}" whenever doing this kind of patch surgery.
TODO
> Maybe other places that check if (compression==PG_COMPRESSION_GZIP)
> should maybe change to say compression!=NONE?
>
> _PrepParallelRestore() references ".gz", so I think it needs to be
> retrofitted to handle .lz4. Ideally, that's built into a struct or list
> of file extensions to try. Maybe compression.h should have a function
> to return the file extension of a given algorithm. I'm planning to send
> a patch for zstd, and hoping its changes will be minimized by these
> preparatory commits.
TODO
> I think it's confusing to have two functions, one named
> InitCompressLZ4() and InitCompressorLZ4().
TODO
> pg_compress_algorithm is being writen directly into the pg_dump header.
> Currently, I think that's not an externally-visible value (it could be
> renumbered, theoretically even in a minor release). Maybe there should
> be a "private" enum for encoding the pg_dump header, similar to
> WAL_COMPRESSION_LZ4 vs BKPIMAGE_COMPRESS_LZ4 ? Or else a comment there
> should warn that the values are encoded in pg_dump, and must never be
> changed.
Michael, WDYT ?
--
Justin
Attachment | Content-Type | Size |
---|---|---|
0001-pg_dump-pass-pg_compress_specification-as-a-pointer.patch | text/x-diff | 11.9 KB |
0002-Prepare-pg_dump-internals-for-additional-compression.patch | text/x-diff | 23.5 KB |
0003-Introduce-Compressor-API-in-pg_dump.patch | text/x-diff | 65.5 KB |
0004-f.patch | text/x-diff | 1.3 KB |
0005-Add-LZ4-compression-in-pg_-dump-restore.patch | text/x-diff | 28.8 KB |
0006-f.patch | text/x-diff | 6.8 KB |
0007-TMP-pg_dump-use-lz4-by-default-for-CI-only.patch | text/x-diff | 2.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Mikhail Gribkov | 2023-01-14 21:56:06 | Re: On login trigger: take three |
Previous Message | Tom Lane | 2023-01-14 21:23:47 | Re: Removing redundant grouping columns |