From: | "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | RE: [HACKERS] Re: Concurrent VACUUM: first results |
Date: | 1999-11-29 00:32:56 |
Message-ID: | 001601bf3a01$47d0ae60$2801007e@cadzone.tpf.co.jp |
Lists: | pgsql-hackers |
>
> I have committed the code change to remove pg_vlock locking from VACUUM.
> It turns out the problems I was seeing initially were all due to minor
> bugs in the lock manager and vacuum itself.
>
> > 1. You can run concurrent "VACUUM" this way, but concurrent "VACUUM
> > ANALYZE" blows up. The problem seems to be that "VACUUM ANALYZE"'s
> > first move is to delete all available rows in pg_statistic.
>
> The real problem was that VACUUM ANALYZE tried to delete those rows
> *while it was outside of any transaction*. If there was a concurrent
> VACUUM inserting tuples into pg_statistic, the new VACUUM would end up
> calling XactLockTableWait() with an invalid XID, which caused a failure
Hmm, what I saw here was always LockRelation(.., RowExclusiveLock),
but the cause may be the same.
We can't get the xid of a backend that is not running a *transaction*,
because its proc->xid is set to 0 (InvalidTransactionId). So the blocking
transaction can't find an xidLookupEnt in xidTable corresponding to the
not-running *transaction* when it runs LockResolveConflicts() in
LockReleaseAll(), and it can't GrantLock() to the XidLookupEnt
corresponding to the not-running *transaction*. As a result, LockAcquire()
from a not-running *transaction* always fails once it is blocked.
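
The point above can be illustrated with a toy table keyed by xid. This is only a sketch of one simplified reading of the failure, not the real lock.c structures: ToyXidTable, toy_register(), toy_find() and toy_grant_to_waiter() are hypothetical names, and the only fact taken from the message is that a backend outside a transaction has xid 0 (InvalidTransactionId).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real PostgreSQL definitions. */
typedef unsigned int ToyXid;
#define TOY_INVALID_XID 0u      /* mirrors InvalidTransactionId == 0 */
#define TOY_TABLE_SIZE  8

typedef struct {
    ToyXid xid;      /* key: the transaction that owns this entry */
    int    holding;  /* number of locks granted to that transaction */
    bool   used;     /* slot is occupied */
} ToyXidEntry;

typedef struct {
    ToyXidEntry slots[TOY_TABLE_SIZE];
} ToyXidTable;

/* Register an entry for a running transaction.
 * Returns NULL for an invalid xid or when the table is full. */
static ToyXidEntry *toy_register(ToyXidTable *t, ToyXid xid)
{
    if (xid == TOY_INVALID_XID)
        return NULL;
    for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
        if (!t->slots[i].used) {
            t->slots[i].used = true;
            t->slots[i].xid = xid;
            t->slots[i].holding = 0;
            return &t->slots[i];
        }
    }
    return NULL;
}

/* Look up an entry by xid. Assumption of this sketch: an invalid xid
 * identifies no transaction, so the lookup can never succeed for it. */
static ToyXidEntry *toy_find(ToyXidTable *t, ToyXid xid)
{
    if (xid == TOY_INVALID_XID)
        return NULL;
    for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
        if (t->slots[i].used && t->slots[i].xid == xid)
            return &t->slots[i];
    }
    return NULL;
}

/* Called on behalf of the releasing backend: try to hand the lock
 * to a waiter that is identified only by its xid. */
static bool toy_grant_to_waiter(ToyXidTable *t, ToyXid waiter_xid)
{
    ToyXidEntry *e = toy_find(t, waiter_xid);
    if (e == NULL)
        return false;   /* waiter cannot be located, so nothing is granted */
    e->holding++;
    return true;
}
```

With a zero-initialized table, a waiter registered under a valid xid is found and granted the lock, while toy_grant_to_waiter(&t, TOY_INVALID_XID) returns false. That is the shape of the problem described: the releaser has no key with which to find the blocked backend.
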
> I have fixed the simpler aspects of the problem by adding missing
> SpinRelease() calls to lock.c, making lmgr.c test for failure, and
> altering VACUUM to not do the bogus row deletion. But I suspect that
> there is more to this that I don't understand. Why does calling
> XactLockTableWait() with an already-committed XID cause the following
That seems strange. Isn't it waiting for a tuple that is being deleted
by vc_updstats() in vc_vacone()?
> code in lock.c to trigger? Is this evidence of a logic bug in lock.c,
> or at least of inadequate checks for bogus input?
>
>     /*
>      * Check the xid entry status, in case something in the ipc
>      * communication doesn't work correctly.
>      */
>     if (!((result->nHolding > 0) && (result->holders[lockmode] > 0)))
>     {
>         XID_PRINT_AUX("LockAcquire: INCONSISTENT ", result);
>         LOCK_PRINT_AUX("LockAcquire: INCONSISTENT ", lock, lockmode);
>         /* Should we retry ? */
>         SpinRelease(masterLock); <<<<<<<<<<<< just added by me
>         return FALSE;
>     }
>
This is the third time I have ended up at this check, and each time
it was caused by some other bug.
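
The added SpinRelease() in the quoted code follows a general rule: every early return taken while a lock is held must release it first. Here is a minimal sketch of that rule, assuming a toy flag-based lock. ToySpin, toy_spin_acquire(), toy_spin_release(), ToyHolderEntry and toy_check_entry() are hypothetical names and do not model the real SpinAcquire()/SpinRelease() or the real holder structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock: just a flag, single-threaded. */
typedef struct { bool held; } ToySpin;

static void toy_spin_acquire(ToySpin *s) { assert(!s->held); s->held = true; }
static void toy_spin_release(ToySpin *s) { assert(s->held);  s->held = false; }

/* Hypothetical per-transaction bookkeeping, loosely shaped like the
 * nHolding / holders[] fields tested in the quoted check. */
typedef struct {
    int n_holding;     /* total locks held */
    int holders[4];    /* per-lock-mode counts; lockmode must be 0..3 */
} ToyHolderEntry;

/* Returns true if the entry is consistent for lockmode.
 * The lock is released on BOTH the failure path and the success path. */
static bool toy_check_entry(ToySpin *master, const ToyHolderEntry *e, int lockmode)
{
    toy_spin_acquire(master);

    if (!(e->n_holding > 0 && e->holders[lockmode] > 0)) {
        /* Inconsistent entry: release before bailing out, otherwise
         * the next caller would find the lock still held. */
        toy_spin_release(master);
        return false;
    }

    toy_spin_release(master);
    return true;
}
```

Calling toy_check_entry() with an entry whose counts are zero returns false and leaves master->held false, so a later call can acquire the lock again. Without the release on the failure path, the second call would trip the assertion in toy_spin_acquire().
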
Regards,
Hiroshi Inoue
Inoue(at)tpf(dot)co(dot)jp