Re: libpq compression

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Denis Smirnov <sd(at)arenadata(dot)io>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: libpq compression
Date: 2020-12-22 18:53:17
Message-ID: 579f308e-c46c-2e79-3f90-9846450a3d71@enterprisedb.com
Lists: pgsql-hackers

On 12/22/20 7:31 PM, Andrey Borodin wrote:
>
>
>> On 22 Dec 2020, at 23:15, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>>
>>
>> On 12/22/20 6:56 PM, Robert Haas wrote:
>>> On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov
>>> <usernamedt(at)yandex-team(dot)ru> wrote:
>>>> When using bidirectional compression, Postgres resource usage
>>>> correlates with the selected compression level. For example, here
>>>> is the Postgresql application memory usage:
>>>>
>>>> No compression - 1.2 GiB
>>>>
>>>> ZSTD
>>>> zstd:1 - 1.4 GiB
>>>> zstd:7 - 4.0 GiB
>>>> zstd:13 - 17.7 GiB
>>>> zstd:19 - 56.3 GiB
>>>> zstd:20 - 109.8 GiB - did not succeed
>>>> zstd:21, zstd:22 > 140 GiB
>>>> Postgres process crashes (out of memory)
>>> Good grief. So, suppose we add compression and support zstd. Then,
>>> can an unprivileged user capable of connecting to the database
>>> negotiate for zstd level 1 and then choose to actually send data
>>> compressed at
>>> zstd level 22, crashing the server if it doesn't have a crapton of
>>> memory? Honestly, I wouldn't blame somebody for filing a CVE if we
>>> allowed that sort of thing to happen. I'm not sure what the solution
>>> is, but we can't leave a way for a malicious client to consume 140GB
>>> of memory on the server *per connection*. I assumed decompression
>>> memory was going to be measured in kB or MB, not GB. Honestly, even at
>>> say L7, if you've got max_connections=100 and a user who wants to make
>>> trouble, you have a really big problem.
>>> Perhaps I'm being too pessimistic here, but man that's a lot of memory.
>>
>> Maybe I'm just confused, but my assumption was this means there's a
>> memory leak somewhere - that we're not resetting/freeing some piece
>> of memory, or so. Why would zstd need so much memory? It seems like
>> a pretty serious disadvantage, so how could it become so popular?
>
> AFAIK it's 700 clients. Does not seem like a super high price for a
> big traffic/latency reduction.
>

I don't see any benchmark results in this thread that would allow me
to draw that conclusion, and I find it hard to believe that 200 MB
per client is a sensible trade-off.

It assumes you have that much memory to spare, and it may allow an
easy DoS attack (although maybe that's no worse than e.g. generating
a lot of I/O or running an expensive function). Maybe allowing the
compression level / decompression buffer size to be limited in
postgresql.conf would be enough, or allowing such compression
algorithms to be disabled altogether.
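
FWIW zstd seems to already have a knob for the decompression side of
this: ZSTD_d_windowLogMax caps the window size the decoder accepts,
and the window (chosen by the sender and recorded in the frame
header) is what drives decompression memory - IIRC the default cap is
already 2^27, i.e. a 128 MB window. A minimal sketch of what a
server-side limit might look like; the PG_ZSTD_WINDOWLOG_MAX constant
and the decompress_chunk() helper are made up for illustration, not
from any patch in this thread:

#include <stdio.h>
#include <zstd.h>

#define PG_ZSTD_WINDOWLOG_MAX	23	/* accept windows up to 8 MB */

static int
decompress_chunk(ZSTD_DCtx *dctx, const void *src, size_t srclen,
				 void *dst, size_t dstcap, size_t *dstlen)
{
	ZSTD_inBuffer	in = {src, srclen, 0};
	ZSTD_outBuffer	out = {dst, dstcap, 0};
	size_t		rc = ZSTD_decompressStream(dctx, &out, &in);

	if (ZSTD_isError(rc))
	{
		/* a frame demanding a larger window fails here, cleanly */
		fprintf(stderr, "zstd: %s\n", ZSTD_getErrorName(rc));
		return -1;
	}

	*dstlen = out.pos;
	return 0;
}

int
main(void)
{
	ZSTD_DCtx  *dctx = ZSTD_createDCtx();

	/* the actual mitigation: one parameter on the decompress context */
	ZSTD_DCtx_setParameter(dctx, ZSTD_d_windowLogMax,
						   PG_ZSTD_WINDOWLOG_MAX);

	/* ... feed data received from the network through
	 * decompress_chunk() ... */

	ZSTD_freeDCtx(dctx);
	return 0;
}

With that, a malicious stream gets rejected with an error instead of
allocating gigabytes, whatever level it was compressed with. The
compression side is easier still - the server picks its own cctx
level, so a GUC could simply clamp whatever the client asks for.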

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
