From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: [PERFORM] Slow BLOBs restoring |
Date: | 2010-12-09 15:05:15 |
Message-ID: | 6530.1291907115@sss.pgh.pa.us |
Lists: | pgsql-hackers pgsql-performance |
I wrote:
> One fairly simple, if ugly, thing we could do about this is skip calling
> reduce_dependencies during the first loop if the TOC object is a blob;
> effectively assuming that nothing could depend on a blob. But that does
> nothing about the point that we're failing to parallelize blob
> restoration. Right offhand it seems hard to do much about that without
> some changes to the archive representation of blobs. Some things that
> might be worth looking at for 9.1:
> * Add a flag to TOC objects saying "this object has no dependencies",
> to provide a generalized and principled way to skip the
> reduce_dependencies loop. This is only a good idea if pg_dump knows
> that or can cheaply determine it at dump time, but I think it can.
I had further ideas about this part of the problem. First, there's no
need for a file format change to fix this: parallel restore is already
groveling over all the dependencies in its fix_dependencies step, so it
could count them for itself easily enough. Second, the real problem
here is that reduce_dependencies processing is O(N^2) in the number of
TOC objects. Skipping it for blobs, or even for all dependency-free
objects, doesn't make that very much better: the kind of people who
really need parallel restore are still likely to bump into unreasonable
processing time. I think what we need to do is make fix_dependencies
build a reverse lookup list of all the objects dependent on each TOC
object, so that the searching behavior in reduce_dependencies can be
eliminated outright. That will take O(N) time and O(N) extra space,
which is a good tradeoff because you won't care if N is small, while if
N is large you have got to have it anyway.
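
To make the reverse-lookup idea concrete, here is a minimal sketch in Python rather than pg_restore's actual C code. The names build_reverse_deps and restore_order, and the dict-of-lists archive representation, are assumptions made for illustration and are not taken from pg_dump. The first function does the one-time pass that fix_dependencies would do (count each entry's dependencies and record who depends on whom); the second shows the reduce_dependencies step touching only the recorded dependents instead of rescanning every TOC entry.

```python
from collections import deque


def build_reverse_deps(deps):
    """One pass over all dependencies (the fix_dependencies analogue).

    deps maps a hypothetical TOC entry id to the list of ids it depends on.
    Returns (rev, remaining): rev[x] lists the entries that depend on x,
    remaining[x] counts x's not-yet-restored dependencies.
    """
    rev = {tid: [] for tid in deps}
    remaining = {}
    for tid, wanted in deps.items():
        # Ignore references to ids that are not present in the archive.
        live = [d for d in wanted if d in deps]
        remaining[tid] = len(live)
        for d in live:
            rev[d].append(tid)
    return rev, remaining


def restore_order(deps):
    """Return one valid restore order using the reverse lookup lists."""
    rev, remaining = build_reverse_deps(deps)
    # Entries with no dependencies are ready immediately.
    ready = deque(tid for tid in deps if remaining[tid] == 0)
    order = []
    while ready:
        tid = ready.popleft()
        order.append(tid)
        # reduce_dependencies analogue: visit only tid's dependents,
        # so the total work is proportional to the number of dependencies.
        for dependent in rev[tid]:
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                ready.append(dependent)
    return order


if __name__ == "__main__":
    toc = {1: [], 2: [1], 3: [1, 2], 4: []}
    print(restore_order(toc))
```

An entry whose count starts at zero, such as a blob with no dependencies, lands in the ready queue without any per-entry scan, which is the effect the proposed "no dependencies" flag was meant to achieve.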
Barring objections, I will do this and back-patch into 9.0. There is
maybe some case for trying to fix 8.4 as well, but since 8.4 didn't
make a separate TOC entry for each blob, it isn't as exposed to the
problem. We didn't back-patch the last round of efficiency hacks in
this area, so I'm thinking it's not necessary here either. Comments?
regards, tom lane