From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Justin Pasher <justinp(at)newmediagateway(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Autovacuum daemon terminated by signal 11 |
Date: | 2009-01-16 23:43:09 |
Message-ID: | 15221.1232149389@sss.pgh.pa.us |
Lists: | pgsql-general pgsql-hackers |
I wrote:
> ... and you've seemingly not managed to install the debug symbols where
> gdb can find them.
But never mind that --- it turns out to be trivial to reproduce the
crash. Just create a database, set its datfrozenxid and datvacuumxid
far in the past (via a manual update of pg_database), enable autovacuum,
and wait a bit.
What is happening is that autovacuum_do_vac_analyze contains

    old_cxt = MemoryContextSwitchTo(AutovacMemCxt);
    ...
    vacuum(vacstmt, relids);
    ...
    MemoryContextSwitchTo(old_cxt);

and at the time it is called by process_whole_db, CurrentMemoryContext
points at TopTransactionContext, which gets destroyed because vacuum()
internally finishes that transaction and starts a new one. When we
come out of vacuum(), CurrentMemoryContext again points at
TopTransactionContext, but *it's not the same one*. The closing
MemoryContextSwitchTo is installing a stale pointer, which then remains
active into CommitTransaction. It's a wonder this code ever works.
The other path through do_autovacuum() escapes this fate because it
enters autovacuum_do_vac_analyze with CurrentMemoryContext pointing
at AutovacMemCxt, which isn't going to go away.
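To make the two paths concrete, here is a toy model in C. It is only a sketch, not PostgreSQL code: ToyContext, the toy_* functions, and the fixed-size pool are all hypothetical stand-ins. Deleted contexts are merely flagged rather than freed, so the stale pointer can be inspected safely. toy_run_buggy() reports whether the current context ends up pointing at a deleted context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory context; NOT the real PostgreSQL MemoryContext. */
typedef struct ToyContext {
    const char *name;
    bool alive;               /* cleared on "delete"; the slot is never reused */
} ToyContext;

#define TOY_POOL_SIZE 8
static ToyContext toy_pool[TOY_POOL_SIZE];
static size_t toy_pool_used = 0;

static ToyContext *CurrentContext;     /* models CurrentMemoryContext */
static ToyContext *TopTransactionCxt;  /* models TopTransactionContext */
static ToyContext *AutovacCxt;         /* models AutovacMemCxt */

static ToyContext *toy_context_create(const char *name)
{
    assert(toy_pool_used < TOY_POOL_SIZE);
    ToyContext *cxt = &toy_pool[toy_pool_used++];
    cxt->name = name;
    cxt->alive = true;
    return cxt;
}

/* Models MemoryContextSwitchTo: installs new_cxt and returns the old one. */
static ToyContext *toy_switch_to(ToyContext *new_cxt)
{
    ToyContext *old = CurrentContext;
    CurrentContext = new_cxt;
    return old;
}

/* Models vacuum(): ends the caller's transaction and starts a new one,
 * so the old transaction context dies and a fresh one becomes current. */
static void toy_vacuum(void)
{
    TopTransactionCxt->alive = false;
    TopTransactionCxt = toy_context_create("TopTransactionContext");
    CurrentContext = TopTransactionCxt;
}

/* Models the save/restore pattern described above. */
static void toy_do_vac_analyze_buggy(void)
{
    ToyContext *old_cxt = toy_switch_to(AutovacCxt);
    toy_vacuum();
    toy_switch_to(old_cxt);   /* reinstalls whatever the caller had, dead or not */
}

/* Returns true if CurrentContext is left pointing at a deleted context. */
static bool toy_run_buggy(bool start_in_transaction_context)
{
    toy_pool_used = 0;
    AutovacCxt = toy_context_create("AutovacMemCxt");
    TopTransactionCxt = toy_context_create("TopTransactionContext");
    CurrentContext = start_in_transaction_context ? TopTransactionCxt : AutovacCxt;
    toy_do_vac_analyze_buggy();
    return !CurrentContext->alive;
}
```

In this model, toy_run_buggy(true) corresponds to the process_whole_db path and toy_run_buggy(false) to the path that enters with AutovacMemCxt current.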
I argue that autovacuum_do_vac_analyze shouldn't attempt to restore the
caller's memory context at all. One possible approach is to make it
re-select AutovacMemCxt at exit, but I wonder if we shouldn't define
its entry and exit conditions as current context being
(the current instance of) TopTransactionContext.
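A rough sketch of the two exit conventions, again as a toy model in C rather than actual backend code. All sketch_* names are hypothetical, and deleted contexts are only flagged so they can be checked safely. The point is that neither variant keeps a saved pointer across the vacuum call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these names are hypothetical, not PostgreSQL's API. */
typedef struct SketchContext {
    const char *name;
    bool alive;
} SketchContext;

#define SKETCH_POOL_SIZE 8
static SketchContext sketch_pool[SKETCH_POOL_SIZE];
static size_t sketch_pool_used;

static SketchContext *sketch_current;   /* models CurrentMemoryContext */
static SketchContext *sketch_top_xact;  /* models TopTransactionContext */
static SketchContext *sketch_autovac;   /* models AutovacMemCxt */

static SketchContext *sketch_create(const char *name)
{
    assert(sketch_pool_used < SKETCH_POOL_SIZE);
    SketchContext *cxt = &sketch_pool[sketch_pool_used++];
    cxt->name = name;
    cxt->alive = true;
    return cxt;
}

/* Models vacuum(): the old transaction context dies, a new one is current. */
static void sketch_vacuum(void)
{
    sketch_top_xact->alive = false;
    sketch_top_xact = sketch_create("TopTransactionContext");
    sketch_current = sketch_top_xact;
}

/* Option 1: re-select the long-lived context at exit. */
static void sketch_vac_exit_in_autovac(void)
{
    sketch_current = sketch_autovac;
    sketch_vacuum();
    sketch_current = sketch_autovac;
}

/* Option 2: entry and exit condition is "current instance of the
 * transaction context"; the global is re-read at exit, never saved. */
static void sketch_vac_exit_in_top_xact(void)
{
    assert(sketch_current == sketch_top_xact);   /* entry condition */
    sketch_current = sketch_autovac;
    sketch_vacuum();
    sketch_current = sketch_top_xact;            /* the new instance */
}

/* Fresh state with the transaction context current. */
static void sketch_reset(void)
{
    sketch_pool_used = 0;
    sketch_autovac = sketch_create("AutovacMemCxt");
    sketch_top_xact = sketch_create("TopTransactionContext");
    sketch_current = sketch_top_xact;
}
```

Either way the function ends in a context that is known to be live, at the cost of callers having to know which one to expect.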
It looks like 8.3 and HEAD take the latter approach and are therefore
safe from this bug. 8.2 seems to escape it also because it doesn't have
process_whole_db anymore, but it's certainly not
autovacuum_do_vac_analyze's fault that it's not broken, because it's still
trying to restore a context that it has no right to assume still exists.
Alvaro, you want to take charge of fixing this?
regards, tom lane