From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Jim Nasby" <Jim(dot)Nasby(at)bluetreble(dot)com>, "Ronan Dunklau" <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump / copy bugs with "big lines" ?
Date: 2016-03-23 17:14:16
Message-ID: d3fe524a-1c78-4cb3-9814-849cd4f43fe6@mm
Lists: pgsql-hackers

Alvaro Herrera wrote:
> > tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
> >
> > which fails because (HEAPTUPLESIZE + len) is again considered
> > an invalid size, the size being 1468006476 in my test.
>
> Um, it seems reasonable to make this one be a huge-zero-alloc:
>
> MemoryContextAllocExtended(CurrentMemoryContext,
> HEAPTUPLESIZE + len,
> MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO)

Good, this allows the tests to go to completion! The tests in question
are dump/reload of a row with several fields totalling 1.4GB (deflated),
with COPY TO/FROM file and psql's \copy in both directions, as well as
pg_dump followed by pg_restore|psql.

The modified patch is attached.
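
For reference, here is a sketch of how Alvaro's suggestion slots into
heap_form_tuple in place of the quoted palloc0 call (illustrative only,
not the exact hunk from the attached patch):

    /*
     * palloc0 rejects requests of 1GB or more, so grab the tuple
     * buffer through MemoryContextAllocExtended instead, with
     * MCXT_ALLOC_HUGE to lift the 1GB cap and MCXT_ALLOC_ZERO to
     * keep palloc0's zero-filled semantics.
     */
    tuple = (HeapTuple) MemoryContextAllocExtended(CurrentMemoryContext,
                                                   HEAPTUPLESIZE + len,
                                                   MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO);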

It provides a useful mitigation for dumping and reloading databases
that have rows in the 1GB-2GB range, subject to these limitations:
- no single field has a text representation exceeding 1GB;
- no row as text exceeds 2GB (\copy from fails beyond that; AFAICS we
  could push this to 4GB with limited changes to libpq, by interpreting
  the Int32 length field in the CopyData message as unsigned, see the
  sketch below).
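
For illustration, a minimal sketch of reading that length word as
unsigned; the helper name and framing are hypothetical, not actual
libpq code:

    #include <stdint.h>

    /*
     * A CopyData message is the byte 'd' followed by a 4-byte
     * big-endian length that includes itself.  Decoded as a signed
     * int32 the payload tops out just under 2GB; reinterpreting the
     * same bytes as uint32 would raise the ceiling to just under 4GB
     * with no change to the wire format.
     */
    static uint32_t
    copydata_payload_len(const unsigned char *hdr)
    {
        /* hdr points at the 4 length bytes after the 'd' type byte */
        uint32_t len = ((uint32_t) hdr[0] << 24) |
                       ((uint32_t) hdr[1] << 16) |
                       ((uint32_t) hdr[2] << 8) |
                        (uint32_t) hdr[3];

        return len - 4;         /* the length word counts itself */
    }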

It's also possible to go beyond 4GB per row with this patch, but only
when not going through the protocol. I've managed to produce a 5.6GB
single-row file with COPY TO file. That doesn't help with pg_dump, but
it might be useful in other situations.

In StringInfo, I've changed int64 to Size, because on 32-bit platforms
the downcast from int64 to Size is lossy, and since the rest of the
allocation routines favor Size, it is more consistent anyway.
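
To make the direction of that change concrete, here is an abbreviated
sketch of StringInfoData with the length fields widened (the real
declaration in lib/stringinfo.h also carries a cursor field):

    #include <stddef.h>

    typedef size_t Size;        /* as PostgreSQL defines it in c.h */

    /*
     * With Size rather than int64, the length fields are 8 bytes on
     * 64-bit platforms but 4 bytes on 32-bit ones, matching what the
     * palloc machinery can actually allocate and avoiding a narrowing
     * int64 -> Size conversion at the allocation boundary.
     */
    typedef struct StringInfoData
    {
        char   *data;           /* allocated buffer */
        Size    len;            /* current length, excluding trailing '\0' */
        Size    maxlen;         /* allocated size of data */
    } StringInfoData;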

I couldn't test on 32-bit, though, as I never seem to have enough
free contiguous memory available on a 32-bit VM to handle that kind
of data.
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
Attachment: huge-stringinfo-v2.diff (text/x-patch, 6.5 KB)