From: | gkokolatos(at)pm(dot)me |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, shiy(dot)fnst(at)fujitsu(dot)com, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com> |
Subject: | Re: Add LZ4 compression in pg_dump |
Date: | 2023-03-01 13:39:14 |
Message-ID: | lsZgBfZRB5w5slcXnKwoL9qgpzdlAC_UwYHXfbj1oioWsNkckwr2BcLvvCi9-x7eV261s8AP5OHwROZ-QYtpeixxt73DQ_d6-LNfghYOaIQ=@pm.me |
Lists: | pgsql-hackers |
------- Original Message -------
On Wednesday, March 1st, 2023 at 12:58 AM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> I found that e9960732a broke writing of empty gzip-compressed data,
> specifically LOs. pg_dump succeeds, but then the restore fails:
>
> postgres=# SELECT lo_create(1234);
> lo_create | 1234
>
> $ time ./src/bin/pg_dump/pg_dump -h /tmp -d postgres -Fc |./src/bin/pg_dump/pg_restore -f /dev/null -v
> pg_restore: implied data-only restore
> pg_restore: executing BLOB 1234
> pg_restore: processing BLOBS
> pg_restore: restoring large object with OID 1234
> pg_restore: error: could not uncompress data: (null)
>
Thank you for looking. This was an untested case.
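To illustrate the class of bug, here is a minimal sketch in C. It does not use zlib and is not the pg_dump code. It assumes, as a guess about the failure mode rather than something established above, that the writer only initializes its stream on the first non-empty write, so an empty object ends up with zero bytes on disk and the reader has nothing valid to decode. The toy container format (one header byte, raw payload, one trailer byte) and every `toy_*` name are hypothetical.

```c
#include <stddef.h>

/* Hypothetical toy container, NOT gzip: header byte, raw payload, trailer byte. */
enum { TOY_HEADER = 0x1f, TOY_TRAILER = 0x03 };

typedef struct ToyWriter
{
    unsigned char buf[256];   /* output sink */
    size_t        len;        /* bytes emitted so far */
    int           started;    /* has the header been written? */
} ToyWriter;

static void
toy_put(ToyWriter *w, unsigned char b)
{
    if (w->len < sizeof(w->buf))
        w->buf[w->len++] = b;
}

/* Emit the header exactly once. */
static void
toy_start(ToyWriter *w)
{
    if (!w->started)
    {
        toy_put(w, TOY_HEADER);
        w->started = 1;
    }
}

/* Lazy initialization: a zero-length write leaves the stream untouched. */
void
toy_write(ToyWriter *w, const void *data, size_t n)
{
    const unsigned char *p = data;

    if (n == 0)
        return;
    toy_start(w);
    for (size_t i = 0; i < n; i++)
        toy_put(w, p[i]);
}

/* Buggy end: with no prior write, nothing at all is emitted. */
void
toy_end_buggy(ToyWriter *w)
{
    if (w->started)
        toy_put(w, TOY_TRAILER);
}

/* Fixed end: always produce a complete, possibly empty, stream. */
void
toy_end_fixed(ToyWriter *w)
{
    toy_start(w);
    toy_put(w, TOY_TRAILER);
}

/* Returns the payload length, or -1 if the input is not a valid stream. */
long
toy_read(const unsigned char *in, size_t n)
{
    if (n < 2 || in[0] != TOY_HEADER || in[n - 1] != TOY_TRAILER)
        return -1;
    return (long) (n - 2);
}

/* Write an empty object, finish the stream, and try to read it back. */
long
toy_roundtrip_empty(int use_fix)
{
    ToyWriter w = {0};

    toy_write(&w, "", 0);
    if (use_fix)
        toy_end_fixed(&w);
    else
        toy_end_buggy(&w);
    return toy_read(w.buf, w.len);
}
```

In this sketch the buggy path writes successfully but fails on read, which mirrors the shape of the report: the dump succeeds and only the restore notices.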
> The inline patch below fixes it, but you won't be able to apply it
> directly, as it's on top of other patches which rename the functions
> back to "Zlib" and rearrange the functions to their original order, to
> allow running:
>
> git diff --diff-algorithm=minimal -w e9960732a~:./src/bin/pg_dump/compress_io.c ./src/bin/pg_dump/compress_gzip.c
>
Please find a patch attached that can be applied directly.
> The current function order avoids 3 lines of declarations, but it's
> obviously pretty useful to be able to run that diff command. I already
> argued for not calling the functions "Gzip" on the grounds that the name
> was inaccurate.
I have no idea why we are back on the naming issue. I stand by the name
because, in my humble opinion, it helps the code reader. There is a certain
uniformity when the compression_spec.algorithm and the compressor
functions match, as the following code sample shows.
    if (compression_spec.algorithm == PG_COMPRESSION_NONE)
        InitCompressorNone(cs, compression_spec);
    else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
        InitCompressorGzip(cs, compression_spec);
    else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
        InitCompressorLZ4(cs, compression_spec);
When the reader wants to see what happens when PG_COMPRESSION_XXX
is set, they simply have to search for the XXX part. I think that this is
justification enough for the use of the names.
>
> I'd want to create an empty large object in src/test/sql/largeobject.sql
> to exercise this during pgupgrade. But unfortunately that
> doesn't use -Fc, so this isn't hit. Empty input is an important enough
> test case to justify a tap test, if there's no better way.
Please find in the attached patch a test case that exercises this code path.
Cheers,
//Georgios
Attachment | Content-Type | Size |
---|---|---|
0001-Properly-gzip-compress-when-no-data-is-available.patch | text/x-patch | 7.8 KB |