From: | Denis Smirnov <sd(at)arenadata(dot)io> |
---|---|
To: | Konstantin Knizhnik <knizhnik(at)garret(dot)ru> |
Cc: | Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: libpq compression |
Date: | 2020-12-17 13:39:50 |
Message-ID: | CCAB1F57-A71D-4DEC-9A9C-F325B44BAC29@arenadata.io |
Lists: | pgsql-hackers |
Hello all,
I’ve finally read the whole thread (it was huge). It is extremely sad that this patch has hung without progress for such a long time. It seems that the main problem in the discussion is that everyone has their own view of which problems this patch should solve. Here are some of the positions (not all of them):
1. Add compression for low-bandwidth networks (and keep the patch as simple and maintainable as possible) - author’s position.
2. Don’t change the current network protocol and related code much.
3. Refactor the compression API (and network compression as well).
4. Solve cloud providers’ problems: trade CPU utilisation for network bandwidth on demand, and vice versa.
All of these requirements differ in nature and sometimes conflict with each other. Without clearly formulated requirements this patch will never be released.
Anyway, I have rebased it onto the current master branch, applied pgindent, tested it on macOS and fixed a macOS-specific problem with strcpy in build_compressors_list(): strcpy has undefined behaviour when the source and destination strings overlap.
- *client_compressors = src = dst = strdup(value);
+ *client_compressors = src = strdup(value);
+ dst = strdup(value);
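To illustrate the overlap problem, here is a minimal sketch (not the patch's code) of filtering a comma-separated list in place. The names filter_list_in_place() and is_supported(), and the hard-coded set of accepted names, are assumptions made up for this example. Because src and dst walk the same buffer, the copy uses memmove(), which is defined for overlapping regions, whereas strcpy() and memcpy() are not.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical predicate: stands in for whatever check decides that a
 * compression algorithm name is supported. Not taken from the patch.
 */
static bool
is_supported(const char *name, size_t len)
{
	return len == 4 &&
		(memcmp(name, "zstd", 4) == 0 || memcmp(name, "zlib", 4) == 0);
}

/*
 * Compact a comma-separated list in place, keeping only supported entries.
 * src and dst point into the same buffer and dst never gets ahead of src,
 * so each copy may overlap its source; memmove() handles that safely.
 */
static void
filter_list_in_place(char *list)
{
	char	   *src = list;
	char	   *dst = list;

	assert(list != NULL);

	while (*src != '\0')
	{
		size_t		len = strcspn(src, ",");	/* length of current entry */
		char	   *next = src + len;

		if (*next == ',')
			next++;				/* skip the separator */

		if (is_supported(src, len))
		{
			if (dst != list)
				*dst++ = ',';	/* separator between kept entries */
			memmove(dst, src, len);
			dst += len;
		}
		src = next;
	}
	*dst = '\0';
}
```

Called on a writable buffer holding "lz4,zstd,zlib", this leaves "zstd,zlib" in the same buffer without allocating a second copy.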
According to my very simple tests with randomly generated data, zstd gives about 3x compression (zlib has a slightly worse compression ratio and slightly higher CPU utilisation). This seems to be a normal ratio for any streaming data: Greenplum also uses zstd/zlib to compress append-optimised tables, and the compression ratio is usually about 3-5x. Also, in my Greenplum experience, the most commonly used zstd compression level is 1, while for zlib it is usually in the range of 1-5. CPU usage and execution time were not affected much compared to uncompressed data (but my tests were very simple and should not be treated as reliable).
Attachment | Content-Type | Size |
---|---|---|
0001-Rebase-patch-27-to-actual-master-and-fix-strcpy.patch.txt | text/plain | 78.1 KB |