From: | Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Disallow cancellation of waiting for synchronous replication |
Date: | 2020-02-20 15:51:09 |
Message-ID: | CANtu0ogbu+y6Py963p-zKJ535b8zm5AOq7zkX7wW-tryPYi1DA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello.
Just want to share some thoughts about how it looks from perspective
of a high availability web-service application developer.
Because sometimes things look different from other sides. And
everything looks like disaster to be honest.
But let's take it one at a time.
First - the problem is not related to upsert queries only. It could be
reproduced by plain INSERTS or UPDATES. For example:
* client 1 inserts new records and waits for synchronous replication
* client 1 cancels the query
* clients 2, 3, 4 and 5 see new data and perform some actions outside
of the database in external systems
* master is switched to the replica with no WAL of new records replicated yet
As a result: newly inserted data are just gone, but external systems
already rely on it.
And this is just a huge pain for the application and its developer.
Second - it is all not about the client who canceled the query. It may
be super clever and totally understand all of the tricky aspects and
risks of such action.
But it is about *other* clients who become able to see the
"non-existing" data. They even have no option to detect such
situation.
Yes, currently there are a few ways for non-synchronous-replicated
data to become visible (for complex reasons of course):
1) client cancels the query while waiting synchronous replications
2) database restart
3) kill -9 of backend waiting synchronous replications
What is the main difference among 1 vs 2 and 3? Because 1 is performed
not only by humans.
And moreover it is performed mostly by applications. And it happens
right now on thousands on servers!
Check [1] and [2]. It is official JDBC driver for PostgreSQL. I am
sure it is the most popular way to communicate with PostgreSQL these
days.
And implementation of Statement::setQueryTimeout creates timer to send
cancellation after the timeout. It is official and recommended way to
limit statement execution to some interval in JDBC.
In my project (and its libraries) it is used in dozens of places. It
is also possible to search GitHub [4] to understand how widely it's
used.
For example it is used by Spring framework[5], probably the most
popular framework in the world for the rank-2 programming language.
And situation is even worse. What is the case when setQueryTimeout
starts to cancel queries during synchronous replication like crazy?
Yes, it is the moment of losing connection between master and sync
replica (because all backends are now stuck in synrep). New master
will be elected in a few seconds.... (or maybe already elected and
working).
And... Totally correct code cancels hundreds of queries stuck in
synrep making "non-existing" data available to be read for other
clients in same availability zone....
It is just nightmare to be honest.
These days almost every web-service needs HA for postgres. And
practically if your code (or some of library code) calls
Statement::setQueryTimeout - your HA (for example - Patroni) is
broken.
And it is really not easy to control setQueryTimeout call in modern
application with thousands of third-party libraries. Also, a lot of
applications are in the support phase.
As for me - I am going to hack postgres jdbc driver to ignore
setQueryTimeout at all for now.
>> I think proper solution here would be to add GUC to disallow cancellation of synchronous replication.
> This sounds entirely insane to me. There is no possibility that you
> can prevent a failure from occurring at this step.
Yes, maybe it is insane but looks like whole java-postgres-HA (and may
be others) world is going down. So, I believe we should even backport
such an insane knob.
As developers of distributed systems we don't have many things to rely
on. And they are:
1) if database clearly says something is committed - it is committed
with ACID guarantees
2) anything else - it may be committed, may be not committed, may be
waiting to be committed
And we've just lost letter D from ACID practically.
Thanks,
Michail.
[1] https://github.com/pgjdbc/pgjdbc/blob/23cce8ad35d9af6e2a1cb97fac69fdc0a7f94b42/pgjdbc/src/main/java/org/postgresql/core/QueryExecutorBase.java#L164-L200
[2] https://github.com/pgjdbc/pgjdbc/blob/ed09fd1165f046ae956bf21b6c7882f1267fb8d7/pgjdbc/src/main/java/org/postgresql/jdbc/PgStatement.java#L538-L540
[3] https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#setQueryTimeout(int)
[4] https://github.com/search?l=Java&q=statement.setQueryTimeout&type=Code
[5] https://github.com/spring-projects/spring-framework/blob/master/spring-jdbc/src/main/java/org/springframework/jdbc/datasource/DataSourceUtils.java#L329-L343
From | Date | Subject | |
---|---|---|---|
Next Message | Kirill Bychik | 2020-02-20 15:56:27 | Re: WAL usage calculation patch |
Previous Message | Anastasia Lubennikova | 2020-02-20 15:38:22 | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |