Quick Links

Re: parallel pg_restore design issues

From:	Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: parallel pg_restore design issues
Date:	2008-10-06 03:04:09
Message-ID:	48E98029.3040108@rhyme.com.au
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Another 0.02c, bringing the grand total to 0.04c.

Andrew Dunstan wrote:
> First, we need a way to decide the boundary between the serially run
> "pre-data" section and the remainder of the items in the TOC.
> Currently the code uses the first TABLEDATA item as the boundary.
> That's not terribly robust (what if there aren't any?).

Drawing on your later comments, and bearing in mind that it is always
recommended to use the *new* pg_dump to upgrade, I'd suggest adding some
flags to the header of the dump that say 'contains data' (might already
be there), then adding flags to TOC entries that indicate useful
properties. ISTM there are effectively 5 types of TOC entry which can be
characterised by: table definitions (including schema -- anything
defining the structure of the db), data load, performance items (index
definitions), integrity items (constraints and triggers) and function
definitions (usually part of an application interface or triggers).

So, based on the current needs, adding:

- isData
- isStructural (must be a better name) -- for anything that is
required *before* data can be loaded.
- isPerformance (likely to improve database performance after theyy
are applied)

would seem to satisfy your need.

I'm making this up as I go along, so adding any other set of flags that
abstracts the questions you are asking in a generic and useful way would
be fine. I'm pretty sure we have places already in pg_dump that could
use such flags.

> Also, people have wanted to steer clear of hardcoding much knowledge
> of archive member types into pg_restore as a way of future-proofing it
> somewhat. I'm wondering if we should have pg_dump explicitly mark
> items as pre-data,data or post-data. For legacy archives we could
> still check for either a TABLEDATA item or something known to sort
> after those (i.e. a BLOB, BLOB COMMENT, CONSTRAINT, INDEX, RULE,
> TRIGGER or FK CONSTRAINT item).

This is, at least to some extent, answered above. Anything that is not
isData or isStructural is post data.

> Another item we have already discussed is how to prevent concurrent
> processes from trying to take conflicting locks. Her we really can't
> rely on pg_dump to help us out, as lock requirements might change (a
> little bird has already whispered in my ear about reducing the
> strength of FK CONSTRAINT locks taken). I haven't got a really good
> answer here.
Several things occurred to me here.

- you need to avoid shooting yourself in the foot by, for example,
trying to define an FK before relevant indexes are defined. This is
probably one area where checking the TOC entry type becomes necessary,
unless we create a flag 'isPerformance' flag (see above) in which case
you load all isPerformance entries immediately after data is loaded.
AFAICT, isPerformance becomes a synonym for entry_type='INDEX' (not
sure), but adding the flag (required for all toc entries) will aid in
future-proofing in ways that checking entry type values do not.

- It would be interesting to see benchmarks of pg_restore running with
*anything* that shared dependencies disallowed from running concurrently
vs. anything that's not an index vs. anything thats not 'isPerformance'.

- allowing lock failures might help; ie. if a statement produces a lock
failure, just wait a bit and put it back at the end of the queue. If it
fails a second time, mark it as 'single-thread-only'. Would be
interesting to see if this imroved the pre-data load, or made it slower.

- Ultimately, I suspect trying to second-guess backend locking will be
a hard-fought battle that will not be worth the resources expended.
Getting 95% of the way there will be good enough so long as lock
conflicts do not crash pg_restore (see previous point).

> Another possible algorithm would reorder the queue by elevating any
> item whose dependencies have been met.

Bad idea; FKs benefit from indexes.

I think, based on the benefit provided by 'isPerformance' items (by
definition), you probably need to introduce a choke point to ensure all
isPerformance items have been executed before moving on to other items.
The locking strategy above may effectively do that (shared dependencies
will be disallowed unless both are 'isPerformance').

> This will mean all the indexes for a table will tend to be grouped
> together, which might well be a good thing, and will tend to limit the
> tendency to do all the data loading at once.

Or it might be a bad thing. Not sure why you can't intermix data and
schema items ('isStructure') tho.

And, I'm curious: what is wrong with loading all the data at once? That
will be the result I suspect for simple queueing reasons: data takes a
long time, eventually without a scheduler and explicit thread limits
(see below), all threads will be loading data.

> Both of these could be modified by explicitly limiting TABLEDATA items
> to a certain proportion (say, one quarter) of the processing slots
> available, if other items are available.

Yep. But is there a benefit?

I have not played with this patch, but does it dump stats of thread
usage (stalled, executing etc)?

I think there could be some benefit in reordering stuff, but it is not
clear to me what the best ordering would be.

> I'm actually somewhat inclined to make provision for all of these
> possibilities via a command line option, with the first being the
> default. One size doesn't fit all, I suspect, and if it does we'll
> need lots of data before deciding what that size is. The extra logic
> won't really involve all that much code, and it will all be confined
> to a couple of functions.

Keeping it encapsulated with the default being a NOP sounds good to me.
Random might even be useful.

In response to

parallel pg_restore design issues at 2008-10-06 00:11:48 from Andrew Dunstan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Simon Riggs	2008-10-06 07:00:31	Re: Subtransaction commits and Hot Standby
Previous Message	Greg Smith	2008-10-06 01:50:40	Re: Add default_val to pg_settings