From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Getting consistent snapshot in multiple backends, for parallel pg_dump |
Date: | 2009-11-07 11:05:41 |
Message-ID: | 1257591941.26768.10.camel@ebony |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, 2009-11-07 at 11:36 +0100, Heikki Linnakangas wrote:
> Me & Simon got talking about the difficulty of doing parallel pg_dump,
> where when you open multiple connections you must somehow ensure that
> all the connections use the same snapshot so that you get a consistent
> dump. We came up with a pretty simple way to do that:
>
> 1. Open N+1 the connections to the server
> 2. In one of them, grab ProcArrayLock in shared mode
> 3. In all other connections, begin a (serializable) transaction.
> 4. Release ProcArrayLock.
>
> Because we're holding the ProcArrayLock across 2-4, all the connections
> get at step 3 will get the same snapshot. That's exactly what we want
> for doing a parallel pg_dump.
>
> A difficulty with that is that we need some way to hold a lwlock until
> the client tells to release it. You can't hold a lwlock over command
> boundaries. But that's surely solvable, e.g by sleeping in the backend
> with the lock held until signaled by another backend. With a timeout to
> make sure we don't block indefinitely if the client crashes or something.
How about this
* In parent session, run
SELECT synchronize_snapshots('master_name',N);
synchronize_snapshots grabs ProcArrayLock and sets a ref count to N,
then waits until ref count is 0 before releasing ProcArrayLock. No need
to wait across a command.
* In N child sessions, begin serializable xact then run
SELECT snapshot_taken('master_name');
which decrements the ref count.
We protect ref count using a spin lock. Ref count is given a name, so
that we can tell apart concurrent requests for synchronize_snapshots()
--
Simon Riggs www.2ndQuadrant.com
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2009-11-07 12:32:52 | Uninitialized Data in WAL records generated in heap_(insert|update|delete) |
Previous Message | Heikki Linnakangas | 2009-11-07 10:36:41 | Getting consistent snapshot in multiple backends, for parallel pg_dump |