From: | Joe Conway <mail(at)joeconway(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru> |
Cc: | Anna Akenteva <a(dot)akenteva(at)postgrespro(dot)ru>, pgsql-docs(at)postgresql(dot)org |
Subject: | Re: [DOCS] Let's document a bytea bug |
Date: | 2020-08-03 20:44:38 |
Message-ID: | 8ba460ba-9947-73e0-6920-a8a2bca436b9@joeconway.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs |
On 8/3/20 4:20 PM, Bruce Momjian wrote:
> On Fri, Jul 31, 2020 at 10:13:48AM +0500, Andrey M. Borodin wrote:
>> Hi Anna!
>>
>> > 23 мая 2018 г., в 20:33, Anna Akenteva <a(dot)akenteva(at)postgrespro(dot)ru> написал(а):
>> >
>> >
>> > Some time ago I've encountered a problem with the bytea type: we can't SELECT
>> > bytea strings whose textual representation is too big to fit into StringInfoData.
>> > And as a side effect, pg_dump refuses to dump tables with big bytea strings.
>> >
>> > It's a bug, it's pretty confusing, but it seems like there's no pretty way
>> > to fix it so far. Here's a link to a recent discussion on the issue:
>> > https://www.postgresql.org/message-id/flat/c8bdf802d41ec37003ec3b726db79428(at)postgrespro(dot)ru#c8bdf802d41ec37003ec3b726db79428@postgrespro.ru
>> >
>> > Since it won't be fixed anytime soon, I thought it could be worth documenting.
>> > Attaching a patch for the documentation: I added some text to the "Binary Data Types"
>> > part where I tried to describe the issue and to explain how to deal with it.
>> >
>> > My patch in plain text (for convenience):
>> >
>> > It is not recommended to use bytea strings whose textual representation
>> > exceeds 1GB, as it may not be possible to SELECT them due to output size
>> > limitations. Consequently, a table containing such big strings cannot be
>> > properly processed by pg_dump, as pg_dump will try to SELECT these values from the
>> > table and fail. The exact size limit advised for bytea strings depends on their
>> > content, the external format and encoding that you are using, the context in
>> > which they will be selected. The general rule is that when you use SELECT,
>> > the returned tuple should not exceed 1GB. Although even if SELECT does not
>> > work, you can still retrieve big bytea strings using COPY in binary format.
>>
>> Thanks for this message. It took me a while to find out what was the problem.
>> +1 for documenting this, maybe even with exact error like
>> [ 2020-07-30 01:20:32.248 MSK pg_dump - 10.3.3.30,XX000 ]:ERROR: invalid memory alloc request size 1472599557
>> It's really really scary. My first feeling was that it's TOAST corruption.
>
> I still have Anna Akenteva's 2018 email in my mailbox because I wanted
> to research this further. Now that you have re-reported the problem, I
> am on it! ;-)
>
> I looked for a clearer reproducible test case, and I have found this
> one:
It is easier to reproduce than that:
select repeat('x',600000000)::bytea;
ERROR: invalid memory alloc request size 1200000003
select octet_length(repeat('x',600000000)::bytea);
octet_length
--------------
600000000
(1 row)
CREATE TABLE big_data AS
select repeat('x',600000000)::bytea;
SELECT 1
SELECT repeat FROM big_data;
ERROR: invalid memory alloc request size 1200000003
\q
pg_dump -d postgres > /tmp/a
pg_dump: error: Dumping the contents of table "big_data" failed: PQgetResult()
failed.
pg_dump: error: Error message from server: ERROR: invalid memory alloc request
size 1200000003
pg_dump: error: The command was: COPY public.big_data (repeat) TO stdout;
There are other cases that also hit the StringInfo limits (i.e. MaxAllocSize)
with TEXT data type (and I guess others, like JSON or XML, too):
select repeat('x',600000000), repeat('y',600000000);
ERROR: out of memory
DETAIL: Cannot enlarge string buffer containing 600000010 bytes by 600000000
more bytes.
Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2020-08-03 21:04:13 | Re: [DOCS] Let's document a bytea bug |
Previous Message | Bruce Momjian | 2020-08-03 20:20:42 | Re: [DOCS] Let's document a bytea bug |