Re: PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint

From: Cameron Smith <csmith(at)stereodllc(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint
Date: 2016-05-19 19:15:49
Message-ID: CO2PR0801MB22144A06118F31215B6023AEA04A0@CO2PR0801MB2214.namprd08.prod.outlook.com
Lists: pgsql-general

I'd agree: most likely a file system problem. Is there any hope that this file could be re-built?

My current plan is to use bdr_part_by_node_names to remove the failing node and then rebuild it from a fresh backup (and probably on a new server).

Thank you for your help!

Cameron Smith

________________________________________
From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Sent: May 19, 2016 2:56 PM
To: Cameron Smith
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint

Cameron Smith wrote:

> t:2016-05-19 01:14:51.668 UTC d= p=144 a=PANIC: could not create replication identifier checkpoint "pg_logical/checkpoints/8-F3923F98.ckpt.tmp": Invalid argument

This line corresponds to the following code in BDR's 9.4.4
src/backend/replication/logical/replication_identifier.c:

    /*
     * no other backend can perform this at the same time, we're protected by
     * CheckpointLock.
     */
    tmpfd = OpenTransientFile(tmppath,
                              O_CREAT | O_EXCL | O_WRONLY | PG_BINARY,
                              S_IRUSR | S_IWUSR);
    if (tmpfd < 0)
        ereport(PANIC,
                (errcode_for_file_access(),
                 errmsg("could not create replication identifier checkpoint \"%s\": %m",
                        tmppath)));

This file does not exist in 9.5, but instead we have
src/backend/replication/logical/origin.c which has identical code.

OpenTransientFile calls BasicOpenFile, which in turn calls open() and
propagates the errno. My manpage doesn't list any possible reasons for
open() to return EINVAL, so I'm at a loss about what is happening here.
Maybe this is a filesystem problem?
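
One way to check whether the filesystem itself rejects this call, independent of PostgreSQL, is a small standalone probe that issues the same open() and reports the errno. The following is only a minimal sketch, assuming a POSIX system; try_create_exclusive is a hypothetical helper name, not anything from PostgreSQL or BDR, and PG_BINARY is left out because it is defined as 0 on non-Windows platforms.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): try to create "path" with the
 * same flags and mode as the OpenTransientFile() call quoted above.
 * Returns 0 on success, or the errno value reported by open() on failure.
 */
static int
try_create_exclusive(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR);

    if (fd < 0)
    {
        /* Save errno first: later library calls may overwrite it. */
        int saved_errno = errno;

        fprintf(stderr, "could not create \"%s\": %s (errno %d)\n",
                path, strerror(saved_errno), saved_errno);
        return saved_errno;
    }

    /* The create succeeded; close and remove the probe file again. */
    close(fd);
    unlink(path);
    return 0;
}
```

Calling this helper as the postgres user with a scratch file name inside the data directory's pg_logical/checkpoints/ (any unused name will do) would show whether open() fails with EINVAL there even without PostgreSQL involved, which would point at the filesystem rather than at BDR.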

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services