Update concurrency

From: Sylvain Déve <sylvain(dot)deve(at)protonmail(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Update concurrency
Date: 2021-12-20 13:45:29
Message-ID: -zxRtNEMT5cJjoFNR4LphpHl907G9wjMd-xShnDUyjZg5cgXZQpRdsNHe6OY96_8eyP9s0cyttLKUwv4FyRrirbUGeEoQe22M0lpaFJSGuo=@protonmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

On a server, I have an API which is called by one or multiple clients. The server can receive, almost at the same time, multiple update instructions for the same PostgreSQL attribute. These requests do not arrive at the exact same time, and I would just like to execute them one after another. Because of the (Flask) API, I cannot control that a previous call is over, so the new update can be called before the previous one is completed. I use psycopg2 to interface with PostgreSQL, and I would except psycopg2/PostgreSQL to be able to queue the updates and proceed them quietly one after another (maybe not even in order), but it does not seem to work that way.

Below is a simplified version of the function called on the server when an update is required. The attribute to be updated is a JSONB object, if that matters. Hence there is 'path' to determine which part of the JSONB object to update. I copy the Python code, not only the PostgreSQL code, because it might be part of the problem. So the API call looks like this:

def pg_update(data, path):
conn = psycopg2.connect(...) # always the same database.
cur = conn.cursor()

# JSONB update for a single selected row and column.
# 'column' and 'select_row' are determined with the parameters 'data' and 'path'.
command = (
f"""UPDATE MY_TABLE SET """
f"""{column} = jsonb_merge({column}, %s) """
f"""WHERE {select_row};"""
)

cur.execute(command, [Json(data)])
conn.commit()

When called twice in a row, this call leads to the error (at "cur.execute"):

psycopg2.errors.InternalError_: tuple concurrently updated

Since the server keeps running all the time, I can also define conn outside the API call "pg_update" so that all calls are processed with the same psycopg2 connection. But then, when two updates are requested for the same attribute, I get instead:

psycopg2.ProgrammingError: execute cannot be used while an asynchronous query is underway

I forced async_ = False in psycopg2.connect, just in case. No change.

I also tried to lock like that:
command = (
f"""BEGIN; """
f"""SELECT * FROM MY_TABLE WHERE {select_row} FOR UPDATE; """
f"""UPDATE MY_TABLE SET """
f"""{column} = jsonb_merge({column}, %s) """
f"""WHERE {select_row};"""
f"""END;"""
)

To summarize, how can I allow for multiple updates of the same attributes, one after another, even when these updates are requested almost at the same time from independent, uncontrollable client requests? Maybe there is a need for a better locking mechanism? A queuing mechanism?

Thanks for your help!

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2021-12-20 13:52:03 Re: Update concurrency
Previous Message Pavel Stehule 2021-12-20 12:42:01 Re: How to reduce query planning time (10s)