From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: We're leaking predicate locks in HEAD |
Date: | 2019-05-08 04:50:02 |
Message-ID: | CA+hUKGJ=yLV+bCYZ6QNG4vS2kCk7WrzLhBiNu-RzR3WePDxqFw@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, May 8, 2019 at 3:53 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > Reproduced here. Once the system reaches a state where it's leaking
> > (which happens only occasionally for me during installcheck-parallel),
> > it keeps leaking for future SSI transactions. The cause is
> > SxactGlobalXmin getting stuck. The attached fixes it for me. I can't
> > remember why on earth I made that change, but it is quite clearly
> > wrong: you have to check every transaction, or you might never advance
> > SxactGlobalXmin.
>
> Hm. So I don't have any opinion about whether this is a correct fix for
> the leak, but I am quite distressed that the system failed to notice that
> it was leaking predicate locks. Shouldn't there be the same sort of
> leak-detection infrastructure that we have for most types of resources?
Well, it is hooked up to the usual release machinery, because it's in
ReleasePredicateLocks(), which is wired into the
RESOURCE_RELEASE_LOCKS phase of resowner.c. The thing is that lock
lifetime is linked to the last transaction with the oldest known xmin,
not the transaction that created them.
More analysis: Lock clean-up is deferred until "... the last
serializable transaction with the oldest xmin among serializable
transactions completes", but I broke that by excluding read-only
transactions from the check so that SxactGlobalXminCount gets out of
sync. There's a read-only SSI transaction in
src/test/regress/sql/transactions.sql, but I think the reason the
problem manifests only intermittently with installcheck-parallel is
that sometimes the read-only optimisation kicks in (effectively
dropping us to plain old SI because there's no concurrent serializable
activity) and it doesn't take any locks at all, and sometimes the
read-only transaction doesn't have the oldest known xmin among
serializable transactions. However, if a read-write SSI transaction
had already taken a snapshot and has the oldest xmin and then the
read-only one starts with the same xmin, we get into trouble. When
the read-only one releases, we fail to decrement SxactGlobalXminCount,
and then we'll never call ClearOldPredicateLocks().
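
To make the failure mode concrete, here is a toy model of the counting problem described above. It is only a sketch, not the code in predicate.c: ToyState, toy_begin(), toy_release(), toy_scenario() and the skip_read_only flag are all made up for illustration, and bumping the "cleanups" counter stands in for the point where the real code would go on to call ClearOldPredicateLocks().

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the shared SSI bookkeeping. */
typedef struct
{
    unsigned    global_xmin;        /* models SxactGlobalXmin */
    int         global_xmin_count;  /* models SxactGlobalXminCount */
    int         cleanups;           /* times the toy clean-up step ran */
} ToyState;

/*
 * A serializable transaction takes its snapshot.  In this toy, new
 * snapshots never have an xmin older than the current global xmin.
 */
static void
toy_begin(ToyState *st, unsigned xmin)
{
    if (st->global_xmin_count == 0)
    {
        st->global_xmin = xmin;
        st->global_xmin_count = 1;
    }
    else if (xmin == st->global_xmin)
        st->global_xmin_count++;
}

/*
 * A serializable transaction releases.  skip_read_only = true models the
 * broken behaviour: read-only transactions are excluded from the check,
 * so their share of the count is never given back.
 */
static void
toy_release(ToyState *st, unsigned xmin, bool read_only, bool skip_read_only)
{
    if (skip_read_only && read_only)
        return;                 /* the bug: count is not decremented */

    if (xmin == st->global_xmin && --st->global_xmin_count == 0)
        st->cleanups++;         /* stands in for ClearOldPredicateLocks() */
}

/*
 * The scenario from the analysis: a read-write transaction holds the
 * oldest xmin, then a read-only one starts with the same xmin.
 * Returns how many times the clean-up step ran.
 */
int
toy_scenario(bool skip_read_only)
{
    ToyState    st = {0, 0, 0};

    toy_begin(&st, 100);                            /* read-write */
    toy_begin(&st, 100);                            /* read-only, same xmin */
    toy_release(&st, 100, true, skip_read_only);    /* read-only releases */
    toy_release(&st, 100, false, skip_read_only);   /* read-write releases */
    return st.cleanups;
}
```

With every transaction checked, toy_scenario(false) reaches a count of zero and runs the clean-up step once. With read-only transactions excluded, toy_scenario(true) leaves the count stuck at 1 and the clean-up step never runs, which is the leak.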
--
Thomas Munro
https://enterprisedb.com