From: José Luis Tallón <jltallon(at)adv-solutions(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fwd: [GENERAL] 4B row limit for CLOB tables
Date: 2015-02-02 21:50:15
Message-ID: 54CFF117.9020206@adv-solutions.net
Lists: pgsql-general pgsql-hackers
On 02/02/2015 09:36 PM, Roger Pack wrote:
> On 2/2/15, José Luis Tallón <jltallon(at)adv-solutions(dot)net> wrote:
>> On 01/31/2015 12:25 AM, Jim Nasby wrote:
>>> [snip]
>>> It's a bit more complex than that. First, toast isn't limited to
>>> bytea; it holds for ALL varlena fields in a table that are allowed to
>>> store externally. Second, the limit is actually per-table: every table
>>> gets its own toast table, and each toast table is limited to 4B
>>> unique OIDs. Third, the OID counter is actually global, but the code
>>> should handle conflicts by trying to get another OID. See
>>> toast_save_datum(), which calls GetNewOidWithIndex().
>>>
>>> Now, the reality is that GetNewOidWithIndex() is going to keep
>>> incrementing the global OID counter until it finds an OID that isn't
>>> in the toast table. That means that if you actually get anywhere close
>>> to using 4B OIDs you're going to become extremely unhappy with the
>>> performance of toasting new data.
>> Indeed ......
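The probing behaviour described above can be modeled with a small simulation (a hypothetical sketch, not PostgreSQL code; the scaled-down OID space and function names are invented for illustration). Once the counter has wrapped and a fraction f of the OID space is still occupied, each allocation needs roughly 1/(1-f) probes on average:

```python
import random

# Toy model of GetNewOidWithIndex()-style allocation: after the 32-bit OID
# counter wraps around, it must skip every value still present in the toast
# table. With a fraction f of the space occupied, finding a free OID is
# roughly a geometric trial with success probability (1 - f).

OID_SPACE = 1 << 20          # scaled-down stand-in for the real 2^32 space
random.seed(1)

def avg_probes(fill_fraction, trials=200):
    """Average probes per allocation at a given fill level of the OID space."""
    used = set(random.sample(range(OID_SPACE), int(OID_SPACE * fill_fraction)))
    counter = random.randrange(OID_SPACE)
    total = 0
    for _ in range(trials):
        probes = 0
        while True:
            counter = (counter + 1) % OID_SPACE   # wraps, like the OID counter
            probes += 1
            if counter not in used:
                break
        used.add(counter)
        total += probes
    return total / trials

for f in (0.5, 0.9, 0.99):
    print(f"fill={f}: ~{avg_probes(f):.1f} probes per allocation")
```

At 99% fill the allocator does roughly a hundred index probes per new toast chunk, which matches the "extremely unhappy" behaviour described above.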
>>
>>> I don't think it would be horrifically hard to change the way toast
>>> OIDs are assigned (I'm thinking we'd basically switch to creating a
>>> sequence for every toast table), but I don't think anyone's ever tried
>>> to push toast hard enough to hit this kind of limit.
>> We did. The Billion Table Project, part 2 (a.k.a. "when does Postgres'
>> OID allocator become a bottleneck").... The allocator becomes
>> essentially unusable at about 2.1B OIDs, whereas it performed very well
>> at "quite empty" (< 100M objects) levels.
>>
>> So yes, using one sequence per TOAST table should help.
>> Combined with the new SequenceAMs / sequence implementation being
>> proposed (specifically: one file for all sequences in a certain
>> tablespace) this should scale much better.
> But it wouldn't be perfect, right? I mean if you had multiple
> deletion/insertions and pass 4B then the "one sequence per TOAST
> table" would still wrap [albeit more slowly], and performance would start
> degrading the same way. And there would still be the hard 4B limit.
> Perhaps the foreign key to the TOAST table could be changed from oid
> (32 bits) to something else (64 bits) [as well as the sequence] so that
> it never wraps?
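The wraparound concern can be put in numbers (a hypothetical back-of-the-envelope sketch; the allocation rate is an invented assumption, not a measurement):

```python
# A 32-bit sequence, even one per toast table, wraps back to 0 after 2^32
# values, so heavy insert/delete churn eventually re-enters occupied ranges.
counter = 2**32 - 1
counter = (counter + 1) % 2**32
print(counter)                                 # 0: the 32-bit counter has wrapped

# A 64-bit key effectively never wraps. Assuming a very aggressive (and
# entirely hypothetical) 100,000 toast allocations per second:
seconds_to_exhaust = 2**64 / 100_000
years = seconds_to_exhaust / (365 * 24 * 3600)
print(f"~{years:,.0f} years to exhaust a 64-bit keyspace")
```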
Hmm.... 2^32 times approx. 2kB (as per usual heuristics, ~4 rows per 8kB
heap page) is 8796093022208 (~8.8e12) bytes
... which results in 8192 1GB segments :O
Looks like partitioning might be needed much sooner than that (if only
for index efficiency reasons)... unless access is purely sequential.
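The arithmetic above can be checked directly (the ~2kB-per-row figure is the heuristic assumption stated above, not a measured value):

```python
# 2^32 toast pointers times ~2kB per externally-stored row (~4 rows per
# 8kB heap page) gives the theoretical maximum table size, which PostgreSQL
# stores on disk as a series of 1GB segment files.
max_oids = 2**32
bytes_total = max_oids * 2 * 1024          # ~2kB per row
print(bytes_total)                          # 8796093022208 (~8.8e12, i.e. 8 TiB)
segments = bytes_total // (1024**3)
print(segments)                             # 8192 one-GB segment files
```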
The problem with changing the id from 32 to 64 bits is that the storage
*for everybody else* doubles, making the implementation slower for
most.... though this might actually not be that important.
The alternative could be some "long LOB" ("HugeOBject"?) using the
equivalent to "serial8" whereas regular LOBs would use "serial4".
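To illustrate the range such a "HugeObject" would gain (a hypothetical sketch; serial4 and serial8 are PostgreSQL's 4-byte and 8-byte auto-incrementing integer types):

```python
# Range of a serial4-keyed LOB vs a hypothetical serial8-keyed "long LOB".
serial4_max = 2**31 - 1        # serial:    1 .. 2147483647
serial8_max = 2**63 - 1        # bigserial: 1 .. 9223372036854775807
print(serial8_max // serial4_max)   # 4294967298, i.e. ~2^32 times more room
```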
Anybody actually reaching this limit out there?
Regards,
/ J .L.