Thanks for looking into this Tom,

 

Glad to have the race condition confirmed. In my view the “real” fix here would be to offer an advisory locking function that accepts a timeout. FWIW, similar functionality on other RDBMS systems (SQL Server, MySQL, Oracle) offers this. That way, users wouldn’t have to rely in SET LOCAL lock_timeout to get this functionality. Thoughts? Is there a backlog item for this or has the idea been considered previously?

 

-Mike

 

From: Tom Lane
Sent: Saturday, November 12, 2022 11:47 AM
To: mike.adelson314@gmail.com
Cc: pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #17686: SELECT pg_advisory_lock(...) with low lock_timeout sometimes times out but the lock is acquired

 

PG Bug reporting form <noreply@postgresql.org> writes:

> I'm finding that with relatively small values of lock_timeout and when the

> system is under load (e.g. 8 connections acquiring concurrently), I will

> encounter a case where the query exits with state 55P03 (lock_not_available)

> and yet the lock was actually acquired (I can tell it has been acquired by

> querying pg_locks and because other connections' calls to pg_advisory_lock

> block).

 

Yeah, there's a fairly basic race condition there, which is that the

lock might get granted to us immediately after the timeout fires.

In ordinary usage this isn't very problematic because the lock would

get released anyway during the transaction abort resulting from the

timeout error.  However, when you're asking for a session-level

advisory lock, that doesn't happen.

 

I spent a little time studying whether there's a way to close the race,

but if it's possible at all it'd take major restructuring of what's

already quite complex and delicate code.  I can't get excited about

putting such effort into it, because related problems will exist

no matter what: if you get an error from the pg_advisory_lock command,

was the lock granted before that error occurred?  I don't think there's

any bulletproof way to deal with that except to check the lock status

afterwards (e.g., via pg_locks).  It's very much like the inherent issue

with errors late in COMMIT --- there's an atomic point where the commit

is logged, but the error report isn't going to be very clear about

whether we got past that, if indeed the error report gets to the client

at all.

 

                                                regards, tom lane