From: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: The plan for FDW-based sharding |
Date: | 2016-02-27 08:19:57 |
Message-ID: | 56D15C2D.6060604@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 02/27/2016 06:54 AM, Robert Haas wrote:
> On Fri, Feb 26, 2016 at 10:56 PM, Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>> We do not have formal prove that proposed XTM is "general enough" to handle
>> all possible transaction manager implementations.
>> But there are two general ways of dealing with isolation: snapshot based and
>> CSN based.
> I don't believe that for a minute. For example, consider this article:
Well, I have to agree that saying that there are just two ways of providing distributed isolation I was not right.
There is at least one more method: conservative locking. But it will cause huge number of extra network messages which has to be exchanged.
Also I mostly considered solutions compatible with PostgreSQL MVCC model.
And definitely their are other approaches. Like preserving transaction commit order (as it is done in Galera).
Some other them can be implemented with XTM (preserving commit order), some - not (2PL).
I have already noticed that XTM is not allowing to implement ANY transaction manager.
But we have considered several approaches to distributed transaction management explained in the article related with really working systems.
Some of them are real production system as SAP HANA, some are just prototypes, but working prototypes for which authors have performed
some benchmarking and comparison with other approaches. The references you have mentioned are mostly theoretical description of the problem.
Nice to know it but it is hard to build some concrete implementation based on this articles.
Briefly answering other your questions:
> For example, consider a table with a million rows spread across any number of servers.
It is sharding scenario, pg_tsdtm will work well in this case does not requiring sending a lot of extra messages.
> Now consider another workload where each transaction reads a row one
one server, reads a row on another server,
It can be solved both with pg_dtm (central arbiter) and pg_tsdtm (no arbiter),
But actually you scenarios just once again proves that there can not be just one ideal distributed TM.
> So maybe the goal for the GTM isn't to provide true serializability
across the cluster but some lesser degree of transaction isolation.
But then exactly which serialization anomalies are we trying to
prevent, and why is it OK to prevent those and not others?
Absolutely agree. There are some theoretical discussion regarding CAP and different distributed level of isolation.
But at practice people want to solve their tasks. Most of PostgeSQL used are using default isolation level: read committed although there are alot of "wonderful" anomalies with it.
Serialazable transaction in Oracle are actually violating fundamental serializability rule and still Oracle is one of ther most popular database in the world...
The was isolation bug in Postgres-XL which doesn't prevent from using it by commercial customers...
So I do not say that discussing all this theoretical questions is not need as formally proven correctness of distributed algorithm.
But I do not understand hot why it should prevent from providing extensible TM API.
Yes, we can tot do everything with it. But still we can implement many different approaches.
I think that it somehow proves that it is "general enough".
> https://en.wikipedia.org/wiki/Global_serializability
>
> I think the neutrality of that article is *very* debatable, but it
> certainly contradicts the idea that snapshots and CSNs are the only
> methods of achieving global serializability.
>
> Or consider this lecture:
>
> http://hssl.cs.jhu.edu/~randal/416/lectures.old/ln5.2.pdf
>
> That's a great introduction to the problem we're trying to solve here,
> but again, snapshots are not mentioned, and CSNs certainly aren't
> mentioned.
>
> This write-up goes further, explaining three different methods for
> ensuring global serializability, none of which mention snapshots or
> CSNs:
>
> http://heaven.eee.metu.edu.tr/~vision/LectureNotes/EE442/Ee442ch7.html
>
> Actually, I think the second approach is basically a snapshot/CSN-type
> approach, but it doesn't use that terminology and the connection to
> what you are proposing is very unclear.
>
> I think you're approaching this problem from a viewpoint that is
> entirely too focused on the code that exists in PostgreSQL today.
> Lots of people have done lots of academic research on how to solve
> this problem, and you can't possibly say that CSNs and snapshots are
> the only solution to this problem unless you haven't read any of those
> papers. The articles above aren't exceptional in mentioning neither
> of the approaches that you are advocating - they are typical of the
> literature in this area. How can it be that the only solutions to
> this problem are ones that are totally different from the approaches
> that university professors who spend time doing research on
> concurrency have spent time exploring?
>
> I think we need to back up here and examine our underlying design
> assumptions. The goal here shouldn't necessarily be to replace
> PostgreSQL's current transaction management with a distributed version
> of the same thing. We might want to do that, but I think the goal is
> or should be to provide ACID semantics in a multi-node environment,
> and specifically the I in ACID: transaction isolation. Making the
> existing transaction manager into something that can be spread across
> multiple nodes is one way of accomplishing that. Maybe the best one.
> Certainly one that's been experimented within Postgres-XC. But it is
> often the case that an algorithm that works tolerably well on a single
> machine starts performing extremely badly in a distributed
> environment, because the latency of communicating between multiple
> systems is vastly higher than the latency of communicating between
> CPUs or cores on the same system. So I don't think we should be
> assuming that's the way forward.
>
> For example, consider a table with a million rows spread across any
> number of servers. Consider also a series of update transactions each
> of which reads exactly one row and then writes that row. If we adopt
> any solution that involves a central coordinator to arbitrate commit
> ordering, this is going to require at least one and probably two
> million network round trips, one per transaction to get a snapshot and
> a second to commit. But all of this is completely unnecessary.
> Because each transaction touches only a single node, a perfect global
> transaction manager doesn't really need to do anything at all in this
> case. The existing PostreSQL mechanisms - snapshot isolation, and SSI
> if you have it turned on - will provide just as much transaction
> isolation on this workload as they would on a workload that only
> touched a single node. If we design a GTM that does two million
> network round trips in this scenario, we have just wasted two million
> network round trips.
>
> Now consider another workload where each transaction reads a row one
> one server, reads a row on another server, and then updates the second
> row. Here, the GTM has a job to do. If T1 reads R1, reads R2, writes
> R2; and T2 concurrently reads R2, reads R1, and then writes R1, it
> could happen that both transactions see the pre-update values of the
> row they read first and yet both transactions go on to commit. That's
> not equivalent to any serial history, so transaction isolation is
> broken. A GTM which aims to provide true cluster-wide serializability
> must do something to keep that from happening. If all of this were
> happening on a single node, those transactions would succeed if run at
> READ COMMITTED but SSI would roll one of them back at SERIALIZABLE.
> So maybe the goal for the GTM isn't to provide true serializability
> across the cluster but some lesser degree of transaction isolation.
> But then exactly which serialization anomalies are we trying to
> prevent, and why is it OK to prevent those and not others?
>
> I have seen zero discussion of any of this. What I think we ought to
> be doing here is describing precisely what might break, and then
> deciding which of those problems we want to fix, and then deciding how
> we can do that with the least amount of network traffic. Jumping to
> "let's make the transaction API pluggable" is presupposing the answer
> to the first two questions without any discussion, and I'm afraid that
> it's not going to lead to a very agreeable solution to the third one.
>
>> Yes, it is certainly possible to develop cluster by cloning PostgreSQL.
>> But it cause big problems both for developers, which have to permanently
>> synchronize their branch with master,
>> and, what is more important, for customers, which can not use standard
>> version of PostgreSQL.
>> It may cause problems with system certification, with running Postgres in
>> cloud,...
>> Actually the history of Postgres-XL/XC and Greenplum IMHO shows that it is
>> wrong direction.
> I think the history of Postgres-XC/XL shows that developing technology
> outside of the PostgreSQL community is a risky business. You might
> end up developing something that is not widely used or adopted, and
> the lack of community review might cause that technology to be less
> good than it would have been had it been done through the community
> process. It seems to me that installing a bunch of hooks here and then
> having you go off and develop outside the community has those same
> perils. (Of course, in that case and this one, working outside the
> community also lets you can also go faster and do things the community
> doesn't like, which are sometimes advantages.)
>
> Also, what you are proposing solves problems for you while maybe
> creating them for other people. You're saying that we should have
> hooks so that you don't have to merge with master. But that's just
> transferring the maintenance burden from you to core. Instead of you
> having to merge when things change, core has got to maintain the hooks
> as things change so that things are easy for you. If there are no
> code changes in the relevant area anyway, then merging is trivial and
> you shouldn't need to worry about it. I could submit a patch adding
> hooks to core to enable all of the things (or even just some of the
> things) that EnterpriseDB has changed in Advanced Server, and that
> patch would be rejected so fast it would make your head spin, because
> of course the core project doesn't want to be burdened with
> maintaining a whole bunch of hooks for the convenience of
> EnterpriseDB. Which is understandable. I think it's fine for you to
> ask whether PostgreSQL will accept a certain set of hooks, but we've
> all got to understand that there is a difference between what is
> convenient for us or our employers and what is actually best for the
> project. I am not under any illusions that those two things are the
> same, and while I do a lot of things that I hope will benefit my
> employer, when I am writing to this mailing list I do not do things
> unless they are in the interest of PostgreSQL. When those two things
> intersect, great; when they don't, and the work is community work,
> PostgreSQL wins. I see very clearly that what you are proposing here
> will benefit your customers, but unless it will also benefit the
> PostgreSQL community in general, it's not a good submission.
>
> But I don't really want to spend a lot of time arguing about politics
> here. The real issue is whether this is a good approach. If it is,
> then it's the right thing to do for PostgreSQL and we should commit
> it. If it's not, then we should reject it. Let's focus on the
> technical concerns I wrote about in the first part of the email rather
> than wrangling about business interests. I'm not blind to the fact
> that we work for different companies and I realize that can create
> some tension, but if we want to *have* a PostgreSQL community we've
> got to try to get past that.
>
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From | Date | Subject | |
---|---|---|---|
Next Message | Álvaro Hernández Tortosa | 2016-02-27 11:58:58 | Re: The plan for FDW-based sharding |
Previous Message | Konstantin Knizhnik | 2016-02-27 07:52:12 | Re: Relation cache invalidation on replica |