From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump / copy bugs with "big lines" ?
Date: 2015-04-08 05:06:42
Message-ID: 5524B762.5060407@BlueTreble.com
Lists: pgsql-hackers
On 4/7/15 10:29 PM, Michael Paquier wrote:
> On Wed, Apr 8, 2015 at 11:53 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>>> In any case, I don't think it would be terribly difficult to allow a bit
>>> more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's
>>> some 1GB limits there too.
>>
>> The point is, those limits are there on purpose. Changing things
>> arbitrarily wouldn't be hard, but doing it in a principled way is
>> likely to require some thought. For example, in the COPY OUT case,
>> presumably what's happening is that we palloc a chunk for each
>> individual datum, and then palloc a buffer for the whole row. Now, we
>> could let the whole-row buffer be bigger, but maybe it would be better
>> not to copy all of the (possibly very large) values for the individual
>> columns over into a row buffer before sending it. Some refactoring
>> that avoids the need for a potentially massive (1.6TB?) whole-row
>> buffer would be better than just deciding to allow it.
>
> I think something to be aware of is that this is also going to require
> some rethinking of the existing libpq functions used to fetch a row
> during COPY, such as PQgetCopyData, so that they can fetch chunks of
> data from one row.
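Right. For anyone who hasn't looked at that path lately, the current
client-side API hands back one complete row per call, so both ends buffer
an entire row at once and a chunked mode would be a behavior change for
every caller. A minimal sketch of today's COPY OUT loop (error handling
trimmed; the function name is just for illustration, assuming "conn" has
already entered COPY OUT via something like COPY ... TO STDOUT):

#include <stdio.h>
#include <libpq-fe.h>

/* Drain a COPY ... TO STDOUT that "conn" has already started. */
static void
drain_copy_out(PGconn *conn)
{
    char   *row;
    int     len;

    /*
     * Blocking fetch: each call returns one whole row in a freshly
     * allocated buffer, so the client holds an entire row in memory,
     * much as the backend does. -1 means COPY is done, -2 means error.
     */
    while ((len = PQgetCopyData(conn, &row, 0)) > 0)
    {
        fwrite(row, 1, len, stdout);
        PQfreemem(row);
    }

    if (len == -2)
        fprintf(stderr, "COPY OUT failed: %s", PQerrorMessage(conn));
}

Any per-row chunking would presumably need a new return convention or a
new function so that existing callers assuming "one call, one row" don't
silently break.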
The discussion about upping the StringInfo limit was for cases where an
encoding conversion blows up because the converted data is now larger.
My impression was that those cases don't expand the data by much, so we
wouldn't be significantly expanding StringInfo.
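For reference, the cap we keep running into is enforced in
enlargeStringInfo(). Paraphrasing from memory rather than quoting the
tree verbatim, the relevant logic looks roughly like this:

#include "postgres.h"
#include "lib/stringinfo.h"
#include "utils/memutils.h"     /* MaxAllocSize, just under 1GB */

void
enlargeStringInfo(StringInfo str, int needed)
{
    int         newlen;

    if (needed < 0)             /* overflow or bogus request */
        elog(ERROR, "invalid string enlargement request size: %d", needed);

    /* Refuse to grow the buffer past MaxAllocSize. */
    if (((Size) needed) >= (MaxAllocSize - (Size) str->len))
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("out of memory"),
                 errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.",
                           str->len, needed)));

    needed += str->len + 1;     /* total space now required */
    if (needed <= str->maxlen)
        return;                 /* already big enough */

    /* Double the allocation until it fits, capped at MaxAllocSize. */
    newlen = 2 * str->maxlen;
    while (needed > newlen)
        newlen = 2 * newlen;
    if (newlen > (int) MaxAllocSize)
        newlen = (int) MaxAllocSize;

    str->data = (char *) repalloc(str->data, newlen);
    str->maxlen = newlen;
}

So even a conversion that only adds a handful of bytes will hit that
ereport() if the row was already near the limit, and repalloc() enforces
the same MaxAllocSize ceiling, which is the palloc tweak I mentioned
earlier.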
I agree that buffering 1.6TB of data would be patently absurd. Handling
the case of COPYing a row that's >1GB clearly needs more work than just
bumping up some size limits. That's why I was wondering whether this is
a real scenario or just a hypothetical one... I'd be surprised if anyone
would be happy with the performance of 1GB tuples, let alone even larger
ones.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com