From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Даниил Захлыстов <usernamedt(at)yandex-team(dot)ru>
Subject: Re: libpq compression
Date: 2020-10-31 19:08:45
Message-ID: 0ae2162b-818f-6a6d-cca6-ee4e73733e8e@postgrespro.ru
Lists: pgsql-hackers
Hi
On 31.10.2020 00:03, Andres Freund wrote:
> Hi,
>
> On 2020-10-29 16:45:58 +0300, Konstantin Knizhnik wrote:
>>> - What does " and it is up to the client whether to continue work
>>> without compression or report error" actually mean for a libpq parameter?
>> It can not happen.
>> The client request from server use of compressed protocol only if
>> "compression=XXX" was specified in connection string.
>> But XXX should be supported by client, otherwise this request will be
>> rejected.
>> So supported protocol string sent by client can never be empty.
> I think it's pretty important for features like this to be able to fail
> softly when the feature is not available on the other side. Otherwise a
> lot of applications need to have unnecessary version dependencies coded
> into them.
Sorry, maybe I do not completely understand your suggestion.
Right now the user just specifies that they want to use compression.
The libpq client sends the server the list of algorithms it supports,
and the server chooses the best one among those it also supports.
The server sends its choice back to the client, and then both sides use that algorithm.
Sorry that in my previous mail I used incorrect examples: the client does
not explicitly specify a compression algorithm -
it just requests compression, and the server chooses the most efficient
algorithm supported by both client and server.
So the client does not need to know the names of the particular algorithms (e.g. zlib,
zstd), and
the choice is based on the assumption that the server (or, better said, the programmer)
knows better than the user which algorithm is more efficient.
That last assumption may be contested, because the user knows better which content
will be sent and which
algorithm is more efficient for that content. But right now, when the
choice is between zstd and zlib,
the first one is always better: faster and with a better compression ratio.
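To illustrate the idea, here is a rough sketch of the server-side choice (not the actual patch code; the function name is invented, only the one-letter codes 'f'/'z'/'n' come from the patch):

    #include <string.h>

    static char
    choose_compression_algorithm(const char *client_algorithms)
    {
        /* server's preference order: zstd ('f') first, then zlib ('z') */
        static const char server_preference[] = "fz";
        int         i;

        for (i = 0; server_preference[i] != '\0'; i++)
        {
            /* take the first algorithm the client also supports */
            if (strchr(client_algorithms, server_preference[i]) != NULL)
                return server_preference[i];
        }
        return 'n';             /* nothing in common: no compression */
    }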
>
>>> - What is the point of the "streaming mode" reference?
>> There are ways of performing compression:
>> - block mode when each block is individually compressed (compressor stores
>> dictionary in each compressed blocked)
>> - stream mode
>> Block mode allows to independently decompress each page. It is good for
>> implementing page or field compression (as pglz is used to compress toast
>> values).
>> But it is not efficient for compressing client-server protocol commands.
>> It seems to me to be important to explain that libpq is using stream mode
>> and why there is no pglz compressor
> To me that seems like unnecessary detail in the user oriented parts of
> the docs at least.
>
Ok, I will remove this phrase.
>>> Why are compression methods identified by one byte identifiers? That
>>> seems unnecessarily small, given this is commonly a once-per-connection
>>> action?
>> It is mostly for simplicity of implementation: it is always simple to work
>> with fixed size messages (or with array of chars rather than array of
>> strings).
>> And I do not think that it can somehow decrease flexibility: this one-letter
>> algorihth codes are not visible for user. And I do not think that we
>> sometime will support more than 127 (or even 64 different compression
>> algorithms).
> It's pretty darn trivial to have a variable width protocol message,
> given that we have all the tooling for that. Having a variable length
> descriptor allows us to extend the compression identifier with e.g. the
> level without needing to change the protocol. E.g. zstd:9 or
> zstd:level=9 or such.
>
> I suggest using a variable length string as the identifier, and split it
> at the first : for the lookup, and pass the rest to the compression
> method.
Yes, I agree that it provides more flexibility.
And it is not a big problem to handle arbitrary strings instead of single characters.
But right now my intention was to prevent the user from choosing the compression
algorithm and especially from specifying the compression level
(which is algorithm-specific). It is a matter of the server-client handshake to
choose the best compression algorithm supported by both of them.
I can definitely rewrite it, but IMHO giving too much flexibility to the user
will just complicate things without any positive effect.
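For reference, splitting such an identifier at the first ':' would be simple enough; a sketch with invented names, assuming the identifier looks like "zstd" or "zstd:level=9":

    #include <string.h>

    static void
    parse_compression_spec(const char *spec, char *name, size_t name_size,
                           const char **options)
    {
        const char *colon = strchr(spec, ':');
        size_t      len = colon ? (size_t) (colon - spec) : strlen(spec);

        if (len >= name_size)
            len = name_size - 1;
        memcpy(name, spec, len);
        name[len] = '\0';
        /* everything after the first ':' (e.g. "level=9") goes to the method */
        *options = colon ? colon + 1 : NULL;
    }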
>
>>> The protocol sounds to me like there's no way to enable/disable
>>> compression in an existing connection. To me it seems better to have an
>>> explicit, client initiated, request to use a specific method of
>>> compression (including none). That allows to enable compression for bulk
>>> work, and disable it in other cases (e.g. for security sensitive
>>> content, or for unlikely to compress well content).
>> It will significantly complicate implementation (because of buffering at
>> different levels).
> Really? We need to be able to ensure a message is flushed out to the
> network at precise moments - otherwise a lot of stuff will break.
Definitely, the stream is flushed after writing each message.
The problem is at the receiver side: several messages can be sent without
waiting for a response and will all be read into the buffer. If the first one is compressed
and the subsequent ones are not, handling that is not so trivial. I
have already faced this problem when compression is switched on: the
backend may send some message right after acknowledging the compression
protocol. That message is already compressed, but it is delivered together
with the uncompressed compression acknowledgement message. I have solved
this problem for switching on compression at the client side,
but I really do not want to solve it at arbitrary moments. And once again:
my opinion is that too much flexibility is not so good here:
there is no sense in switching off compression for short messages (the overhead
of switching can be larger than the cost of compression itself).
>
>> Also it is not clear to me who and how will control enabling/disabling
>> compression in this case?
>> I can imagine that "\copy" should trigger compression. But what about
>> (server side) "copy" command?
> The client could still trigger that. I don't think it should ever be
> server controlled.
Sorry, but I still think that the possibility of turning compression on/off on
the fly is a very doubtful idea.
And it would require extending the protocol (and adding some extra functions
to the libpq library to control it).
>> And concerning security risks... In most cases such problem is not relevant
>> at all because both client and server are located within single reliable
>> network.
> The trend is towards never trusting the network, even if internal, not
> the opposite.
>
>
>> It if security of communication really matters, you should not switch
>> compression in all cases (including COPY and other bulk data transfer).
>> It is very strange idea to let client to decide which data is "security
>> sensitive" and which not.
> Huh?
Sorry if I was unclear. I am not a specialist in security.
But it seems very dangerous when the client (and not the server) decides
which data is "security sensitive"
and which is not. Just an example: you have VerySecretTable, and when you
perform selects on this table,
you switch compression off. But then the client wants to execute a COPY command
on this table, and since it requires bulk data transfer,
the user decides to switch compression on.
Actually, speaking about security risks makes little sense here: if you want to
provide security, you use TLS, and it provides its own compression.
So there is no need to perform compression at the libpq level; libpq
compression is for the cases when you do not need SSL at all.
>>> I think that would also make cross-version handling easier, because a
>>> newer client driver can send the compression request and handle the
>>> error, without needing to reconnect or such.
>>>
>>> Most importantly, I think such a design is basically a necessity to make
>>> connection poolers to work in a sensible way.
>> I do not completely understand the problem with connection pooler.
>> Right now developers of Yandex Odyssey are trying to support libpq
>> compression in their pooler.
>> If them will be faced with some problems, I will definitely address
>> them.
> It makes poolers a lot more expensive if they have to decompress and
> then recompress again. It'd be awesome if even the decompression could
> be avoided in at least some cases, but compression is the really
> expensive part. So if a client connects to the pooler with
> compression=zlib and an existing server connection is used, the pooler
> should be able to switch the existing connection to zlib.
My expectation was that the pooler is installed in the same network as the
database, so there is no need
to compress traffic between the pooler and the backends.
But even if it is really needed, I do not see problems here.
Certainly the zpq library needs to support different compression
algorithms in the same application.
But that is already done.
To let the pooler avoid decompression+compression, it would be necessary to
send the message header in raw (uncompressed) format.
But that would greatly reduce the effect of stream compression.
>
>
>> But if you think that it is so important, I will try to implement it. Many
>> questions arise in this case: which side should control compression level?
>> Should client affect compression level both at client side and at server
>> side? Or it should be possible to specify separately compression level for
>> client and for server?
> I don't think the server needs to know about what the client does,
> compression level wise.
Sorry, I do not understand you.
Compression takes place at the sender side.
So the client compresses its messages before sending them to the server,
and the server compresses the responses sent to the client.
In both cases a compression level may be specified.
And it is not clear whether the client should control the compression level of
the server.
>
>>>> +<para>
>>>> + Used compression algorithm. Right now the following streaming compression algorithms are supported: 'f' - Facebook zstd, 'z' - zlib, 'n' - no compression.
>>>> +</para>
>>> I would prefer this just be referenced as zstd or zstandard, not
>>> facebook zstd. There's an RFC (albeit only "informational"), and it
>>> doesn't name facebook, except as an employer:
>>> https://tools.ietf.org/html/rfc8478
>> Please notice that it is internal encoding, user will specify
>> psql -d "dbname=postgres compression=zstd"
>>
>> If name "zstd" is not good, I can choose any other.
> All I was saying is that I think you should not name it ""Facebook zstd",
> just "zstd".
Sorry, it was my error.
Right now the compression name is not specified by the user at all: just a boolean
on/off value.
> I think you should just use something like
>
> pq_beginmessage(buf, 'z');
> pq_sendbyte(buf, compression_method_byte);
> pq_endmessage(buf);
>
> like most of the backend does? And if you, as I suggest, use a variable
> length compression identifier, use
> pq_sendcountedtext(buf, compression_method, strlen(compression_method));
> or such.
Sorry, I cannot do it this way because the pq connection with the client is
not yet initialized.
So I cannot use pq_endmessage here and have to call secure_write manually.
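Just to illustrate what "manually" means here (a simplified sketch only; the 'z' message type and one-byte payload are illustrative, not necessarily the exact format used by the patch):

    #include "postgres.h"
    #include "libpq/libpq.h"
    #include "miscadmin.h"
    #include "port/pg_bswap.h"

    static void
    send_compression_ack(char algorithm)
    {
        char        buf[6];
        uint32      len = pg_hton32(5); /* 4-byte length word + 1 byte of payload */

        buf[0] = 'z';               /* hypothetical CompressionAck message type */
        memcpy(buf + 1, &len, 4);
        buf[5] = algorithm;         /* chosen algorithm code, e.g. 'f' for zstd */

        /* the pq send buffer is not set up yet, so write the bytes directly */
        if (secure_write(MyProcPort, buf, sizeof(buf)) != sizeof(buf))
            ereport(COMMERROR,
                    (errmsg("failed to send compression acknowledgement")));
    }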
> I am not saying it's necessarily the best approach, but it doesn't seem
> like it'd be hard to have zpq not send / receive the data from the
> network, but just transform the data actually to-be-sent/received from
> the network.
>
> E.g. in pq_recvbuf() you could do something roughly like the following,
> after a successful secure_read()
>
> if (compression_enabled)
> {
> zpq_decompress(network_data, network_data_length,
> &input_processed, &output_data, &output_data_length);
> copy_data_into_recv_buffer(output_data, output_data_length);
> network_data += input_processed;
> network_data_length -= input_processed;
> }
Sorry, it is not possible, because to produce some output the decompressor may
need to perform multiple reads.
And doing that in two different places, i.e. in pqsecure_read and
secure_read, is worse than doing it in one place, as it is done now.
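To show the shape of the problem, an approximate sketch of how zpq_read has to loop over the underlying reads (the type and helper names here are invented, not the actual zpq internals):

    #include <sys/types.h>

    typedef ssize_t (*zpq_rx_func) (void *arg, void *buf, size_t size);

    typedef struct
    {
        zpq_rx_func rx_func;        /* underlying read, e.g. wrapping secure_read */
        void       *arg;            /* context passed through to rx_func */
        char        rx_buf[8192];   /* compressed bytes received from the network */
        /* ... decompressor state ... */
    } ZpqStreamSketch;

    /* hypothetical wrappers around the compression library's streaming API */
    static size_t decompress_pending(ZpqStreamSketch *zs, void *dst, size_t dst_size);
    static void feed_compressed(ZpqStreamSketch *zs, const void *src, size_t len);

    static ssize_t
    zpq_read_sketch(ZpqStreamSketch *zs, void *buf, size_t size)
    {
        for (;;)
        {
            /* try to decompress whatever compressed input we already have */
            size_t      produced = decompress_pending(zs, buf, size);

            if (produced > 0)
                return (ssize_t) produced;

            /* not enough compressed input to produce anything: read more */
            ssize_t     rc = zs->rx_func(zs->arg, zs->rx_buf, sizeof(zs->rx_buf));

            if (rc <= 0)
                return rc;          /* error or end of stream */
            feed_compressed(zs, zs->rx_buf, (size_t) rc);
        }
    }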
> and the inverse on the network receive side. The advantage would be that
> we can then more easily use zpq stuff in other places.
>
> If we do go with the current callback model, I think we should at least
> extend it so that a 'context' parameter is shuffled through zpq, so that
> other code can work without global state.
The callback has a void* arg through which arbitrary data can be passed to it.
>
>
>>>> + r = PqStream
>>>> + ? zpq_read(PqStream, PqRecvBuffer + PqRecvLength,
>>>> + PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed)
>>>> + : secure_read(MyProcPort, PqRecvBuffer + PqRecvLength,
>>>> + PQ_RECV_BUFFER_SIZE - PqRecvLength);
>>>> + PqRecvLength += processed;
>>> ? : doesn't make sense to me in this case. This should be an if/else.
>>>
>> Isn't it a matter of style preference?
>> Why if/else is principle better than ?:
> ?: makes sense when it's either much shorter, or when it allows to avoid
> repeating a bunch of code. E.g. if it's just a conditional argument to a
> function with a lot of other arguments.
>
> But when, as here, it just leads to weirder formatting, it really just
> makes the code harder to read.
From my point of view, the main role of the ?: construct is not just
saving some bytes/lines of code.
It emphasizes that both branches of the conditional produce the
same kind of result -
in this case, the result of a read operation. If you replace it with
if-else, this relation between the actions done in the two branches is lost
(or at least less clear).
>
>> I agree that sometimes ?: leads to more complex and obscure expressions.
>> But to you think that if-else in this case will lead to more clear or
>> readable code?
> Yes, pretty clearly for me.
>
> r = PqStream
> ? zpq_read(PqStream, PqRecvBuffer + PqRecvLength,
> PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed)
> : secure_read(MyProcPort, PqRecvBuffer + PqRecvLength,
> PQ_RECV_BUFFER_SIZE - PqRecvLength);
> vs
>
> if (PqStream)
> r = zpq_read(PqStream, PqRecvBuffer + PqRecvLength,
> PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed);
> else
> r = secure_read(MyProcPort, PqRecvBuffer + PqRecvLength,
> PQ_RECV_BUFFER_SIZE - PqRecvLength);
>
> the if / else are visually more clearly distinct, the sole added
> repetition is r =. And most importantly if / else are more common.
I agree that ?: is less clear than if/else. Most modern
programming languages have no ?: construct,
but allow if-else to be used as an expression. And repetition is always bad,
isn't it? Not just because of writing extra code,
but because of possible inconsistencies and errors (much like normal
forms in databases).
In any case, I do not think this is fundamental: if you prefer, I can
replace it with if-else.
>> Another question is whether conditional expression here is really good idea.
>> I prefer to replace with indirect function call...
> A branch is cheaper than indirect calls, FWIW>
Really? Please notice that we are not comparing just a branch with an indirect
call, but a branch plus a direct call vs. an indirect call.
I didn't measure it. But in any case, IMHO the performance aspect is less
important here than the clarity and maintainability of the code.
And an indirect call is definitely better in this sense... Unfortunately I
cannot use it (certainly it is not absolutely impossible, but it
requires many
more changes).
>>>> +#define ZSTD_BUFFER_SIZE (8*1024)
>>>> +#define ZSTD_COMPRESSION_LEVEL 1
>>> Add some arguments for choosing these parameters.
>>>
>> What are the suggested way to specify them?
>> I can not put them in GUCs (because them are also needed at client side).
> I don't mean arguments as in configurable, I mean that you should add a
> comment explaining why 8KB / level 1 was chosen.
Sorry, will do.
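Something along these lines, perhaps (just a draft; the stated reasons are a guess at the rationale and may need correcting):

    /*
     * 8kB matches the size of libpq's existing send/receive buffers
     * (PQ_SEND_BUFFER_SIZE / PQ_RECV_BUFFER_SIZE), and level 1 is zstd's
     * fastest standard compression level, trading some compression ratio
     * for low CPU overhead on protocol traffic.
     */
    #define ZSTD_BUFFER_SIZE        (8*1024)
    #define ZSTD_COMPRESSION_LEVEL  1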
>>>> +/*
>>>> + * Get list of the supported algorithms.
>>>> + * Each algorithm is identified by one letter: 'f' - Facebook zstd, 'z' - zlib.
>>>> + * Algorithm identifies are appended to the provided buffer and terminated by '\0'.
>>>> + */
>>>> +void
>>>> +zpq_get_supported_algorithms(char algorithms[ZPQ_MAX_ALGORITHMS])
>>>> +{
>>>> + int i;
>>>> + for (i = 0; zpq_algorithms[i].name != NULL; i++)
>>>> + {
>>>> + Assert(i < ZPQ_MAX_ALGORITHMS);
>>>> + algorithms[i] = zpq_algorithms[i].name();
>>>> + }
>>>> + Assert(i < ZPQ_MAX_ALGORITHMS);
>>>> + algorithms[i] = '\0';
>>>> +}
>>> Uh, doesn't this bake ZPQ_MAX_ALGORITHMS into the ABI? That seems
>>> entirely unnecessary?
>> I tried to avoid use of dynamic memory allocation because zpq is used both
>> in client and server environments with different memory allocation
>> policies.
> We support palloc in both(). But even without that, you can just have
> one function to get the number of algorithms, and then have the caller
> pass in a large enough array.
Certainly it is possible.
But why complicate the implementation if it is an internal function used only
in a single place in the code?
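For reference, the variant you describe would look roughly like this (a sketch only, reusing the existing zpq_algorithms table; names are approximate):

    int
    zpq_get_algorithm_count(void)
    {
        int         n = 0;

        while (zpq_algorithms[n].name != NULL)
            n++;
        return n;
    }

    void
    zpq_get_supported_algorithms(char *algorithms, size_t size)
    {
        size_t      i;

        for (i = 0; i + 1 < size && zpq_algorithms[i].name != NULL; i++)
            algorithms[i] = zpq_algorithms[i].name();
        algorithms[i] = '\0';
    }

    /* caller: n = zpq_get_algorithm_count(); algorithms = palloc(n + 1); ... */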
>
>
>> And I afraid that using the _pq_ parameter stuff makes enabling of
>> compression even less user friendly.
> I didn't intend to suggest that the _pq_ stuff should be used in a
> client facing manner. What I suggesting is that if the connection string
> contains compression=, the driver would internally translate that to
> _pq_.compression to pass that on to the server, which would allow the
> connection establishment to succeed, even if the server is a bit older.
A new client can establish a connection with an old server if it is not using
compression.
And if it wants to use compression, then _pq_.compression will not help
much.
An old server will ignore it (unlike the compression= option), but then the client
will wait for an acknowledgement from the
server about the chosen compression algorithm and will never receive one. Certainly
it is possible to handle this situation (when we do not
receive the expected message), but why is it better than adding a compression
option to the startup packet?
Still, if you think that adding new options to the startup packet should be
avoided as much as possible,
then I will replace it with _pq_.compression.
> Greetings,
>
> Andres Freund
Thank you very much for the review and explanations.