From: Hannu Krosing <hannu(at)skype(dot)net>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Snapshot cloning
Date: 2007-01-26 07:19:16
Message-ID: 1169795956.3368.8.camel@localhost.localdomain
Lists: pgsql-hackers
On Thu, 2007-01-25 at 22:19, Jan Wieck wrote:
> Granted this one has a few open ends so far and I'd like to receive some
> constructive input on how to actually implement it.
>
> The idea is to clone an existing serializable transaction's snapshot
> visibility information from one backend to another. The semantics would
> be like this:
>
> backend1: start transaction;
> backend1: set transaction isolation level serializable;
> backend1: select pg_backend_pid();
> backend1: select publish_snapshot(); -- will block
>
> backend2: start transaction;
> backend2: set transaction isolation level serializable;
> backend2: select clone_snapshot(<pid>); -- will unblock backend1
>
> backend1: select publish_snapshot();
>
> backend3: start transaction;
> backend3: set transaction isolation level serializable;
> backend3: select clone_snapshot(<pid>);
>
> ...
>
> This will allow a number of separate backends to assume the same MVCC
> visibility, so that they can query independently while the overall
> result still conforms to one consistent snapshot of the database.
I see uses for this in implementing query parallelism in user-level
code, for example querying two child tables in two separate processes.
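For instance (just a sketch, assuming the proposed publish_snapshot()
and clone_snapshot() functions existed, with t_2006 and t_2007 as
made-up child tables of one parent):

  backend1: start transaction;
  backend1: set transaction isolation level serializable;
  backend1: select pg_backend_pid();
  backend1: select publish_snapshot();     -- blocks until cloned

  backend2: start transaction;
  backend2: set transaction isolation level serializable;
  backend2: select clone_snapshot(<pid>);  -- unblocks backend1
  backend2: select count(*) from t_2006;

  backend1: select count(*) from t_2007;

Because both backends then share one snapshot, the two counts can simply
be added up on the client side, and the total is consistent as of a
single point in time.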
> What I try to accomplish with this is to widen a bottleneck, many
> current Slony users are facing. The initial copy of a database is
> currently limited to one single reader to copy a snapshot of the data
> provider. With the above functionality, several tables could be copied
> in parallel by different client threads, feeding separate backends on
> the receiving side at the same time.
I'm afraid that for most configurations this would make the copy slower,
as there will be more random disk I/O.
It might be better to fix Slony so that it allows the initial copy to
run in several parallel transactions, or just to do the initial copy as
several sets and merge the sets later.
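Roughly like this (a sketch; t_2006 and t_2007 are made-up per-set
tables, and note that the two transactions see *different* snapshots,
so Slony's normal catch-up replay would have to bring both sets to the
same sync point before merging them):

  conn1: start transaction isolation level serializable;
  conn1: copy t_2006 to stdout;   -- set 1

  conn2: start transaction isolation level serializable;
  conn2: copy t_2007 to stdout;   -- set 2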
> The feature could also be used by a parallel version of pg_dump as well
> as data mining tools.
>
> The cloning process needs to make sure that the clone_snapshot() call is
> made by the same DB user, in the same database, as the corresponding
> publish_snapshot() call.
Why? A snapshot is universal and the same for the whole DB instance, so
why limit it to the same user/database?
> Since publish_snapshot() only
> publishes information it gained legally and that is visible in the
> PGPROC shared memory (xmin and xmax being the crucial part here), there
> is no risk of creating a snapshot for which data might already have been
> removed by vacuum.
>
> What I am not sure about yet is what IPC method would best suit the
> transfer of the arbitrarily sized xip vector. Ideas?
>
>
> Jan
>
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com