From: Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Subject: Re: libpq compression
Date: 2021-03-18 19:30:09
Message-ID: 161609580905.28624.5304095609680400810.pgcf@coridan.postgresql.org
Lists: pgsql-hackers
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed
Hi,
I've compared the different libpq compression approaches in the streaming physical replication scenario.
Test setup
Three hosts: the first runs pg_restore, the second is the master, and the third is the standby replica.
In each test run, I restored the IMDB database (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2QYZBT)
with pg_restore and measured the traffic received on the standby replica.
I also enlarged the ZPQ_BUFFER_SIZE buffer in all versions, because a too-small buffer (8192 bytes) led to more
socket read/write system calls and poor compression in the chunked-reset scenario.
Scenarios:
chunked
use streaming compression, wrap compressed data into CompressedData messages and preserve the compression context between multiple CompressedData messages.
https://github.com/usernamedt/libpq_compression/tree/chunked-compression
chunked-reset
use streaming compression, wrap compressed data into CompressedData messages and reset the compression context on each CompressedData message.
https://github.com/usernamedt/libpq_compression/tree/chunked-reset
permanent
use streaming compression, send raw compressed stream without any wrapping
https://github.com/usernamedt/libpq_compression/tree/permanent-w-enlarged-buffer
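The difference between the chunked and chunked-reset scenarios can be sketched with any streaming compressor. This is an illustration only, not the patch code: zlib stands in for ZSTD, and the message payload is made up. The point is that preserving the compression context across messages lets later messages reference earlier data, while resetting the context per message throws that history away.

```python
import zlib

# Repetitive traffic, roughly like a replication stream replaying
# similar rows over and over (payload is illustrative).
messages = [b"INSERT INTO title (name, year) VALUES ('The Matrix', 1999);\n"] * 50

# chunked: one streaming context shared across all messages.
# Z_SYNC_FLUSH emits each message on a byte boundary without
# resetting the compressor's history window.
ctx = zlib.compressobj(5)
chunked = sum(len(ctx.compress(m)) + len(ctx.flush(zlib.Z_SYNC_FLUSH))
              for m in messages)

# chunked-reset: a fresh context per message, so each message is
# compressed with no knowledge of the previous ones.
chunked_reset = sum(len(zlib.compress(m, 5)) for m in messages)

print(f"shared context: {chunked} bytes, reset per message: {chunked_reset} bytes")
```

On repetitive data the shared-context total comes out far smaller, which matches the gap between the chunked and chunked-reset rows below, and explains why the gap widens at higher compression levels.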
Tested compression levels
ZSTD, level 1
ZSTD, level 5
ZSTD, level 9
Scenario        Replica rx, mean, MB
uncompressed    6683.6

ZSTD, level 1
Scenario        Replica rx, mean, MB
chunked-reset   2726
chunked         2694
permanent       2694.3

ZSTD, level 5
Scenario        Replica rx, mean, MB
chunked-reset   2234.3
chunked         2123
permanent       2115.3

ZSTD, level 9
Scenario        Replica rx, mean, MB
chunked-reset   2153.6
chunked         1943
permanent       1941.6
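For a quick sense of scale, the compression ratios implied by the numbers above (6683.6 MB uncompressed baseline divided by the mean received traffic) can be computed directly:

```python
# Ratios derived from the measurements reported in this message.
baseline = 6683.6  # uncompressed, MB
rx = {
    ("chunked-reset", 1): 2726,   ("chunked", 1): 2694, ("permanent", 1): 2694.3,
    ("chunked-reset", 5): 2234.3, ("chunked", 5): 2123, ("permanent", 5): 2115.3,
    ("chunked-reset", 9): 2153.6, ("chunked", 9): 1943, ("permanent", 9): 1941.6,
}
for (scenario, level), mb in rx.items():
    print(f"ZSTD level {level}, {scenario}: {baseline / mb:.2f}x")
```

That is roughly 2.5x at level 1 for all three approaches, with chunked and permanent pulling ahead of chunked-reset (about 3.44x vs. 3.10x) at level 9.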
The full report with additional data and resource usage graphs is available here:
https://docs.google.com/document/d/1a5bj0jhtFMWRKQqwu9ag1PgDF5fLo7Ayrw3Uh53VEbs
Based on these results, I suggest sticking with the chunked compression approach:
it is more flexible and adds almost no overhead compared to permanent compression.
Later, we could also introduce a setting that controls whether the compression context is reset on each message,
without breaking backward compatibility.
--
Daniil Zakhlystov
The new status of this patch is: Ready for Committer