From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sean Chittenden <sean(at)chittenden(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: 'full_page_writes=off', VACUUM and crashing streaming slaves...
Date: 2012-10-07 16:08:36
Message-ID: 12579.1349626116@sss.pgh.pa.us
Lists: pgsql-general
Sean Chittenden <sean(at)chittenden(dot)org> writes:
>> If you've got the postmaster logs from this episode, it would be useful
>> to see what complaints got logged.
> The first crash scenario:
> Oct 5 15:00:24 db01 postgres[75852]: [6449-2] javafail(at)dbcluster 75852 0: STATEMENT: SELECT /* query */ FROM tbl AS this_ WHERE this_.user_id=$1
> Oct 5 15:00:24 db01 postgres[75852]: [6456-1] javafail(at)dbcluster 75852 0: ERROR: could not seek to end of file "base/16387/20013": Too many open files in system
> [snip - lots of could not seek to end of file errors. How does seek(2) consume a file descriptor??? ]
It doesn't, but FileSeek() might need to do an open if the file wasn't
currently open. This isn't that surprising.
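The reason is that fd.c multiplexes many "virtual" file descriptors over a
limited pool of real kernel FDs; touching a VFD whose kernel descriptor has
been swapped out forces a re-open, and it's that open(2) that hits ENFILE.
A simplified sketch of the idea (names and details are mine, not the actual
fd.c source):

    #include <fcntl.h>
    #include <unistd.h>

    typedef struct VfdSketch
    {
        int    fd;        /* kernel FD, or -1 while swapped out */
        char  *fileName;  /* path to re-open with */
        off_t  seekPos;   /* remembered seek position */
    } VfdSketch;

    /* Re-open the kernel FD if it was swapped out to stay under the
     * FD limit; this open(2) is what fails with ENFILE. */
    static int
    vfd_access(VfdSketch *v)
    {
        if (v->fd < 0)
        {
            v->fd = open(v->fileName, O_RDWR);
            if (v->fd < 0)
                return -1;    /* surfaces as "could not seek ..." */
            if (lseek(v->fd, v->seekPos, SEEK_SET) < 0)
                return -1;
        }
        return 0;
    }

    /* "Seek to end of file" on a virtual FD: the hidden open happens
     * in vfd_access, before any actual lseek. */
    static off_t
    vfd_seek_end(VfdSketch *v)
    {
        if (vfd_access(v) < 0)
            return -1;
        v->seekPos = lseek(v->fd, 0, SEEK_END);
        return v->seekPos;
    }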
> Oct 5 15:00:25 db01 postgres[76648]: [5944-1] javafail(at)dbcluster 76648 0: FATAL: pipe() failed: Too many open files in system
This message must be coming from initSelfPipe(), and after poking around
a bit I think the failure must be occurring while a new backend is
attempting to do "OwnLatch(&MyProc->procLatch)" in InitProcess. The
reason the postmaster treats this as a crash is that the new backend
just armed the dead-man switch (MarkPostmasterChildActive) but it exits
without doing ProcKill which would disarm it. So this is just an
order-of-operations bug in InitProcess: we're assuming that it can't
fail before reaching "on_shmem_exit(ProcKill, 0)", and the latch
additions broke that. (Though looking at it, assuming that the
PGSemaphoreReset call cannot fail seems a tad risky too.)
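Here's a toy model of that ordering hazard (the stub names are mine, and
atexit() is standing in for on_shmem_exit(); a sketch, not the real proc.c
code):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static int deadman_armed = 0;     /* the postmaster's dead-man switch */

    static void mark_child_active(void) { deadman_armed = 1; }
    static void proc_kill(void)         { deadman_armed = 0; }

    /* OwnLatch -> initSelfPipe -> pipe(); with the system-wide FD
     * table full, the FATAL exit happens right here. */
    static void
    own_latch(void)
    {
        int selfpipe[2];

        if (pipe(selfpipe) < 0)
        {
            perror("FATAL:  pipe() failed");
            exit(1);                  /* deadman_armed is still 1 */
        }
    }

    int
    main(void)
    {
        mark_child_active();          /* switch armed ...             */
        own_latch();                  /* ... can die here ...         */
        atexit(proc_kill);            /* ... before the disarm exists */
        return 0;
    }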
So that explains the crashes, but it doesn't (directly) explain why you
had data corruption.
I think the uninitialized pages are showing up because you had crashes
in the midst of relation-extension operations, ie, some other backend
had successfully done an smgrextend but hadn't yet laid down any valid
data in the new page. However, this theory would not explain more than
one uninitialized page per crash, and your previous message seems to
show rather a lot of uninitialized pages. How many pipe-failed crashes
did you have?
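To make the window concrete, here's a toy model of the extension sequence
(the file name and structure are mine, not smgr.c):

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    #define BLCKSZ 8192               /* PostgreSQL's default block size */

    int
    main(void)
    {
        char page[BLCKSZ];
        int  fd = open("segment.0", O_RDWR | O_CREAT, 0600);

        if (fd < 0)
            return 1;

        /* Step 1: physically extend the file with an all-zeroes
         * page, as smgrextend does. */
        memset(page, 0, sizeof(page));
        if (write(fd, page, sizeof(page)) != (ssize_t) sizeof(page))
            return 1;

        /* Crash window: kill -9 the process here and the segment
         * ends with exactly one uninitialized page. */

        /* Step 2: on first use the page is init'd and filled in. */
        memset(page, 0x5A, sizeof(page));   /* stand-in for real data */
        if (pwrite(fd, page, sizeof(page), 0) != (ssize_t) sizeof(page))
            return 1;

        return close(fd);
    }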
> What's odd to me is not the failure scenarios that come from a system running out of FDs (though seek(2)'ing consuming an FD seems odd), it's more that it's still possible for a master DB's VACUUM to clean up a bogus or partial page write, and have the slave crash when the WAL entry is shipped over.
It looks to me like vacuumlazy.c doesn't bother to emit a WAL record at
all when fixing an all-zeroes heap page. I'm not sure if that's a
problem or not. The first actual use of such a page ought to result in
re-init'ing it anyway (cf XLOG_HEAP_INIT_PAGE logic in heap_insert),
so right offhand I don't see a path from this to the slave-side failures
you saw. (But on the other hand I'm only firing on one cylinder today
because of a head cold, so maybe I'm just missing it.)
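For reference, the logic in question looks roughly like this (paraphrased
from memory, so take the details with a grain of salt; the key point is
that there's no XLogInsert call):

    if (PageIsNew(page))
    {
        /* all-zeroes page, presumably left by a crashed extension */
        ereport(WARNING,
                (errmsg("relation \"%s\" page %u is uninitialized --- fixing",
                        relname, blkno)));
        PageInit(page, BufferGetPageSize(buf), 0);
        empty_pages++;
        /* no XLogInsert() here: the standby never hears about the fix */
    }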
Do the slave-side failures correspond to pages that were reported as
"fixed" on the master?
regards, tom lane