Re: Too many serialization errors in production

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: pavan95 <pavan(dot)postgresdba(at)gmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Too many serialization errors in production
Date: 2019-04-22 12:51:43
Message-ID: 20190422125143.GA10281@alvherre.pgsql
Lists: pgsql-admin

On 2019-Apr-22, pavan95 wrote:

> So you are suggesting that changing the isolation level from the ODOO
> application side / limiting a user to execute "set
> default_transaction_isolation='Repeatable Read'" is a bad idea, correct?

Yeah. Changing the isolation level is a one-line change in the source
code, but the implications for the correctness of the application are
very serious, and might eventually cause your data to become invalid.
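
The usual way to live with these errors, rather than weakening the
isolation level, is to retry the failed transaction from the client.
A rough sketch with psycopg2 (the DSN and the apply_change callback are
placeholders, not anything from this thread):

    # Retry a transaction on serialization failure (SQLSTATE 40001).
    # psycopg2 >= 2.8 assumed; dsn and apply_change are placeholders.
    import psycopg2
    import psycopg2.errors

    def run_with_retry(dsn, apply_change, max_attempts=5):
        conn = psycopg2.connect(dsn)
        conn.set_session(isolation_level="REPEATABLE READ")
        try:
            for _ in range(max_attempts):
                try:
                    with conn.cursor() as cur:
                        apply_change(cur)  # the actual transactional work
                    conn.commit()
                    return
                except psycopg2.errors.SerializationFailure:
                    conn.rollback()  # nothing was committed; safe to retry
            raise RuntimeError("gave up after %d attempts" % max_attempts)
        finally:
            conn.close()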

I wonder if this problem is related to this bugfix, which came out of a
report from Odoo. If you're running 9.5.3 or earlier (or 9.4.8 or
earlier, or 9.3.13 or earlier), the fix might just be to update to a
newer minor release.
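
You can check which minor you're on from any client; a quick sketch
(empty conninfo means the usual environment defaults):

    # Print the server version to see whether the fix below is included.
    import psycopg2

    conn = psycopg2.connect("")  # connection parameters from the environment
    with conn.cursor() as cur:
        cur.execute("SHOW server_version;")
        print(cur.fetchone()[0])  # e.g. '9.5.3' would predate the fix
    conn.close()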

Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Branch: master Release: REL9_6_BR [533e9c6b0] 2016-07-15 14:17:20 -0400
Branch: REL9_5_STABLE Release: REL9_5_4 [649dd1b58] 2016-07-15 14:17:20 -0400
Branch: REL9_4_STABLE Release: REL9_4_9 [166873dd0] 2016-07-15 14:17:20 -0400
Branch: REL9_3_STABLE Release: REL9_3_14 [6c243f90a] 2016-07-15 14:17:20 -0400

Avoid serializability errors when locking a tuple with a committed update

When key-share locking a tuple that has been not-key-updated, and the
update is a committed transaction, in some cases we raised
serializability errors:
ERROR: could not serialize access due to concurrent update

Because the key-share doesn't conflict with the update, the error is
unnecessary and inconsistent with the case that the update hasn't
committed yet. This causes problems for some usage patterns, even if it
can be claimed that it's sufficient to retry the aborted transaction:
given a steady stream of updating transactions and a long locking
transaction, the long transaction can be starved indefinitely despite
multiple retries.
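
For illustration, the pattern boils down to something like this
two-session sketch (table and column names are made up):

    # Session A key-share locks a row (as a foreign-key check would) after
    # session B has committed a non-key update to the same row.
    import psycopg2

    a = psycopg2.connect("")
    b = psycopg2.connect("")
    a.set_session(isolation_level="REPEATABLE READ")

    with a.cursor() as cur_a:
        cur_a.execute("SELECT 1")  # session A's snapshot is taken here

        with b.cursor() as cur_b:  # session B: a committed non-key update
            cur_b.execute("UPDATE parent SET note = 'x' WHERE id = 1")
        b.commit()

        # Before the fix this could raise:
        #   ERROR: could not serialize access due to concurrent update
        # even though FOR KEY SHARE does not conflict with a non-key update.
        cur_a.execute("SELECT * FROM parent WHERE id = 1 FOR KEY SHARE")
    a.commit()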

To fix, we recognize that HeapTupleSatisfiesUpdate can return
HeapTupleUpdated when an updating transaction has committed, and that we
need to deal with that case exactly as if it were a non-committed
update: verify whether the two operations conflict, and if not, carry on
normally. If they do conflict, however, there is a difference: in the
HeapTupleBeingUpdated case we can just sleep until the concurrent
transaction is gone, while in the HeapTupleUpdated case this is not
possible and we must raise an error instead.
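
Schematically, the decision now made looks like this (only a sketch of
the logic described above, not the server's C code):

    # Schematic of the locking decision; the names mirror the return codes
    # of HeapTupleSatisfiesUpdate, but this is not the actual implementation.
    def lock_decision(status, conflicts):
        if status in ("HeapTupleBeingUpdated", "HeapTupleUpdated"):
            if not conflicts:
                return "carry on normally"  # e.g. KEY SHARE vs. non-key update
            if status == "HeapTupleBeingUpdated":
                return "sleep until the updater is gone, then recheck"
            return "raise serialization error"  # committed: cannot wait it out
        return "lock the tuple"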

Per trouble report from Olivier Dony.

In addition to a couple of test cases that verify the changed behavior,
I added a test case to verify the behavior that remains unchanged,
namely that errors are raised when an update that modifies the key is
used. That must still generate serializability errors. One
pre-existing test case changes behavior; per discussion, the new
behavior is actually the desired one.

Discussion: https://www.postgresql.org/message-id/560AA479.4080807@odoo.com
https://www.postgresql.org/message-id/20151014164844.3019.25750@wrigleys.postgresql.org

Backpatch to 9.3, where the problem appeared.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
