From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Vinayak Pokale <pokale_vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
Subject: | Re: Transactions involving multiple postgres foreign servers |
Date: | 2017-08-01 08:56:09 |
Message-ID: | CAD21AoDbmAQf=CfLLg9Peuezm7_-3iwrMS2J=Zq38udGCVBmOw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Aug 1, 2017 at 3:43 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Jul 31, 2017 at 1:27 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
>> Postgres-XL seems to manage this problem by using a transaction manager
>> node, which is in charge of assigning snapshots. I don't know how that
>> works, but perhaps adding that concept here could be useful too. One
>> critical point to that design is that the app connects not directly to
>> the underlying Postgres server but instead to some other node which is
>> or connects to the node that manages the snapshots.
>>
>> Maybe Michael can explain in better detail how it works, and/or how (and
>> if) it could be applied here.
>
> I suspect that if you've got a central coordinator server that is the
> jumping-off point for all distributed transactions, the Postgres-XL
> approach is hard to beat (at least in concept, not sure about the
> implementation). That server is brokering all of the connections to
> the data nodes anyway, so it might as well tell them all what
> snapshots to use while it's there. When you scale to multiple
> coordinators, though, it's less clear that it's the best approach.
> Now one coordinator has to be the GTM master, and that server is
> likely to become a bottleneck -- plus talking to it involves extra
> network hops for all the other coordinators. When you then move the
> needle a bit further and imagine a system where the idea of a
> coordinator doesn't even exist, and you've just got a loosely coupled
> distributed system where distributed transactions might arrive on any
> node, all of which are also servicing local transactions, then it
> seems pretty likely that the Postgres-XL approach is not the best fit.
>
> We might want to support multiple models. Which one to support first
> is a harder question. The thing I like least about the Postgres-XC
> approach is it seems inevitable that, as Michael says, the central
> server handing out XIDs and snapshots is bound to become a bottleneck.
> That type of system implicitly constructs a total order of all
> distributed transactions, but we don't really need a total order. If
> two transactions don't touch the same data and there's no overlapping
> transaction that can notice the commit order, then we could make those
> commit decisions independently on different nodes without caring which
> one "happens first". The problem is that it might take so much
> bookkeeping to figure out whether that is in fact the case in a
> particular instance that it's even more expensive than having a
> central server that functions as a global bottleneck.
>
> It might be worth some study not only of Postgres-XL but also of other
> databases that claim to provide distributed transactional consistency
> across nodes. I've found literature on this topic from time to time
> over the years, but I'm not sure what the best practices in this area
> actually are.
Yeah, it's worth studying other databases and considering an approach
that fits well with the PostgreSQL architecture. I've read some papers
related to distributed transaction management, but I'm also not sure
what the best practices in this area are. However, one trend I've seen
is that some cloud-native databases such as Google Spanner[1] and
CockroachDB employ techniques that use timestamps to determine
visibility without centralized coordination. Google Spanner uses GPS
clocks and atomic clocks, but since these are not common hardware,
CockroachDB uses local timestamps synchronized with NTP instead. Other
transaction techniques using local timestamps have also been discussed.
For example, Clock-SI[2] derives snapshots and commit timestamps from
loosely synchronized physical clocks, though it doesn't support the
serializable isolation level. IIUC the postgrespro multi-master cluster
employs a technique based on that. I've not read it deeply yet, but
last week I found a new paper[3] that introduces a new SI mechanism
that allows transactions to determine their timestamps autonomously,
without relying on centralized coordination. PostgreSQL currently uses
XIDs to determine visibility, but mapping each XID to its timestamp
using the commit timestamp feature might allow PostgreSQL to use
timestamps for that purpose.
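
To make the last point a bit more concrete, here is a rough sketch (in
Python, not PostgreSQL code) of what a timestamp-based visibility check
could look like. It assumes every committed XID can be mapped to a
commit timestamp, roughly what track_commit_timestamp and
pg_xact_commit_timestamp() provide, and that a snapshot is just a
timestamp. TupleVersion, committed_before, is_visible and the commit_ts
dictionary are hypothetical names for illustration only, and the sketch
ignores clock skew, in-doubt (prepared) transactions and subtransactions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TupleVersion:
    """Hypothetical stand-in for a heap tuple header."""
    xmin: int             # XID that created this version
    xmax: Optional[int]   # XID that deleted it, or None if still live
    value: str


def committed_before(xid: int, snapshot_ts: float,
                     commit_ts: Dict[int, float]) -> bool:
    """True if xid has a commit timestamp at or before the snapshot.

    An XID missing from commit_ts is treated as in progress or aborted.
    """
    ts = commit_ts.get(xid)
    return ts is not None and ts <= snapshot_ts


def is_visible(tup: TupleVersion, snapshot_ts: float,
               commit_ts: Dict[int, float]) -> bool:
    """Visibility decided by timestamps instead of an XID-based snapshot."""
    # The inserting transaction must have committed before the snapshot.
    if not committed_before(tup.xmin, snapshot_ts, commit_ts):
        return False
    # A deletion hides the version only if it also committed before it.
    if tup.xmax is not None and committed_before(tup.xmax, snapshot_ts,
                                                 commit_ts):
        return False
    return True


# XID 100 committed at t=10, XID 101 (which replaced the row) at t=20,
# XID 102 has no commit timestamp yet.
commit_ts = {100: 10.0, 101: 20.0}
old = TupleVersion(xmin=100, xmax=101, value="old")
new = TupleVersion(xmin=101, xmax=None, value="new")
pending = TupleVersion(xmin=102, xmax=None, value="pending")
```

With a snapshot taken at t=15 only the "old" version is visible, and
with a snapshot at t=25 only the "new" one is. A real design would also
have to deal with nodes whose clocks are behind the snapshot timestamp,
which is the part Clock-SI[2] addresses.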
> https://en.wikipedia.org/wiki/Global_serializability
> claims that a technique called Commitment Ordering (CO) is teh
> awesome, but I've got my doubts about whether that's really an
> objective description of the state of the art. One clue is that the
> global serializability article says three separate times that the
> technique has been widely misunderstood. I'm not sure exactly which
> Wikipedia guideline that violates, but I think Wikipedia is supposed
> to summarize the views that exist on a topic in accordance with their
> prevalence, not take a position on which view is correct.
> https://en.wikipedia.org/wiki/Commitment_ordering contains citations
> from the papers only of one guy, Yoav Raz, which is another hint that
> this may not be as widely-regarded a technique as the person who wrote
> these articles thinks it should be. Anyway, it would be good to
> understand what other well-regarded systems do before we choose what
> we want to do.
[1] https://research.google.com/archive/spanner.html
[2] https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/samehe-clocksi.srds2013.pdf
[3] https://arxiv.org/pdf/1704.01355.pdf
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center