From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [PERFORM] Slow BLOBs restoring |
Date: | 2010-12-09 13:05:33 |
Message-ID: | AANLkTinvRa_XM88FV8vJjjqkRONeNiQiY=Lxg5iARqtU@mail.gmail.com |
Lists: | pgsql-hackers pgsql-performance |
On Thu, Dec 9, 2010 at 12:28 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
>> 08.12.2010 22:46, Tom Lane writes:
>>> Are you by any chance restoring from an 8.3 or older pg_dump file made
>>> on Windows? If so, it's a known issue.
>
>> No, I tried Linux only.
>
> OK, then it's not the missing-data-offsets issue.
>
>> I think you can reproduce it. First I created a database full of many
>> BLOBs on Postgres 8.4.5. Then I created a dump:
>
> Oh, you should have said how many was "many". I had tried with several
> thousand large blobs yesterday and didn't see any problem. However,
> with several hundred thousand small blobs, indeed it gets pretty slow
> as soon as you use -j.
>
> oprofile shows all the time is going into reduce_dependencies during the
> first loop in restore_toc_entries_parallel (ie, before we've actually
> started doing anything in parallel). The reason is that for each blob,
> we're iterating through all of the several hundred thousand TOC entries,
> uselessly looking for anything that depends on the blob. And to add
> insult to injury, because the blobs are all marked as SECTION_PRE_DATA,
> we don't get to parallelize at all. I think we won't get to parallelize
> the blob data restoration either, since all the blob data is hidden in a
> single TOC entry :-(
>
> So the short answer is "don't bother to use -j in a mostly-blobs restore,
> because it isn't going to help you in 9.0".
>
> One fairly simple, if ugly, thing we could do about this is skip calling
> reduce_dependencies during the first loop if the TOC object is a blob;
> effectively assuming that nothing could depend on a blob. But that does
> nothing about the point that we're failing to parallelize blob
> restoration. Right offhand it seems hard to do much about that without
> some changes to the archive representation of blobs. Some things that
> might be worth looking at for 9.1:
>
> * Add a flag to TOC objects saying "this object has no dependencies",
> to provide a generalized and principled way to skip the
> reduce_dependencies loop. This is only a good idea if pg_dump knows
> that or can cheaply determine it at dump time, but I think it can.
>
> * Mark BLOB TOC entries as SECTION_DATA, or somehow otherwise make them
> parallelizable. Also break the BLOBS data item apart into an item per
> BLOB, so that that part's parallelizable. Maybe we should combine the
> metadata and data for each blob into one TOC item --- if we don't, it
> seems like we need a dependency, which will put us back behind the
> eight-ball. I think the reason it's like this is we didn't originally
> have a separate TOC item per blob; but now that we added that to support
> per-blob ACL data, the monolithic BLOBS item seems pretty pointless.
> (Another thing that would have to be looked at here is the dependency
> between a BLOB and any BLOB COMMENT for it.)
>
> Thoughts?
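To make the cost described above concrete, here is a rough Python model of it. This is not pg_restore's code: `TocEntry`, `mark_dependents`, the `has_dependents` field and the `use_flag` switch are names made up for illustration, and `has_dependents` stands in for the proposed "this object has no dependencies" flag, assuming it can be computed in one pass at dump time.

```python
from dataclasses import dataclass, field


@dataclass
class TocEntry:
    dump_id: int
    desc: str
    deps: set = field(default_factory=set)  # dump_ids this entry waits on
    has_dependents: bool = True             # hypothetical flag, conservative default


def mark_dependents(entries):
    """One pass at 'dump time': record which entries anything depends on."""
    depended_on = set()
    for te in entries:
        depended_on |= te.deps
    for te in entries:
        te.has_dependents = te.dump_id in depended_on


def reduce_dependencies(done, entries, use_flag=False):
    """Drop `done` from every entry's dependency set.

    Returns how many entries were examined, as a stand-in for the time spent.
    """
    if use_flag and not done.has_dependents:
        return 0  # nothing depends on this entry, so skip the scan
    examined = 0
    for te in entries:
        examined += 1
        te.deps.discard(done.dump_id)
    return examined


def make_entries(n_blobs):
    """A toy TOC: one table, one index on it, and n_blobs independent blobs."""
    entries = [TocEntry(1, "TABLE"), TocEntry(2, "INDEX", {1})]
    entries += [TocEntry(i + 3, "BLOB") for i in range(n_blobs)]
    return entries


def total_examined(n_blobs, use_flag):
    entries = make_entries(n_blobs)
    mark_dependents(entries)
    return sum(reduce_dependencies(te, entries, use_flag) for te in entries)
```

In this model the unflagged work grows with the square of the number of TOC entries, while the flagged version only scans for the one entry that something actually depends on. It says nothing about the second problem, that the blobs still cannot be restored in parallel.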
Is there any use case for restoring a BLOB but not the BLOB COMMENT or
BLOB ACLs? Can we just smush everything together into one section?
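A minimal sketch of what I mean by smushing, in Python rather than anything resembling the real archive format. The `(kind, oid, payload)` tuples, the kind labels and both function names are invented for illustration; the point is only that once everything for a blob lives in one item, there is no cross-item dependency left to track.

```python
from collections import defaultdict

# Assumed order in which the pieces of one blob would be replayed.
ORDER = ("BLOB", "DATA", "COMMENT", "ACL")


def combine_blob_items(items):
    """Group separate per-blob pieces into one item per blob OID.

    `items` is an iterable of (kind, oid, payload) tuples.
    """
    combined = defaultdict(dict)
    for kind, oid, payload in items:
        combined[oid][kind] = payload
    return dict(combined)


def restore_steps(item):
    """Return the pieces of one combined item in replay order."""
    return [(kind, item[kind]) for kind in ORDER if kind in item]
```

Each combined item could then be handed to a worker on its own, since the metadata, data, comment and ACL for a blob are replayed together in a fixed order.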
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company