Re: Big UPDATE breaking replication

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: Kouber Saparev <kouber(at)saparev(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Big UPDATE breaking replication
Date: 2013-06-04 15:47:40
Message-ID: 51AE0C1C.8050903@pinpointresearch.com
Lists: pgsql-admin

On 06/04/2013 04:53 AM, Kouber Saparev wrote:
> Hello,
>
> We are using the 9.1 built-in streaming replication.
>
> Recently our slave nodes fell behind because of an UPDATE statement. It
> took about 3 minutes to execute, but it affected half a million records,
> hence the replication broke with the "requested WAL segment ... has
> already been removed" series of error messages.
>
> The WAL settings we have are:
>
> max_wal_senders = 6
> wal_keep_segments = 60
> max_standby_archive_delay = 300s
>
>
> I guess increasing the wal_keep_segments value would prevent it from
> happening in the future, but by how much should we increase it? What
> value would be high enough?

You can protect against this with WAL shipping, or by setting
wal_keep_segments higher. I set wal_keep_segments on my main server to a
tad over 1,000 and know I can do a full restore on the master without
coming close to breaking replication. My xlog dir is 17G. A bit of a
waste, perhaps, but I've noted no ill effects and it's still only a
sliver of the total drive capacity.
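
For reference, a minimal postgresql.conf sketch along those lines (the
segment count and archive path are illustrative, not a recommendation):

    # Keep roughly 16 GB of WAL on the master (segments are 16 MB by default):
    wal_keep_segments = 1024

    # Or ship WAL to an archive the standby can replay from, so a lagging
    # standby can catch up even after the master recycles its segments:
    archive_mode = on
    archive_command = 'cp %p /mnt/wal_archive/%f'   # hypothetical archive path

With archiving in place, the standby also needs a matching
restore_command in its recovery.conf so it can pull segments it can no
longer stream.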

>
> Also, we noticed a strange error message appearing shortly before and
> after this same statement: "LOG: out of file descriptors: Too many open
> files; release and retry".
>
> Could it be related somehow, and what does it mean exactly?

What is your setup? (Linux? Mac? Windows? VM in the cloud? How many
simultaneous connections?) You will find a lot of info in old messages
on the subject, but, paraphrasing the docs on max_files_per_process: if
the kernel is enforcing a safe per-process limit, you don't need to
worry about this setting. But on some platforms (notably, most BSD
systems - looking at you, Mac), the kernel will allow individual
processes to open many more files than the system as a whole can
actually support if many processes all try to open that many files.
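
If you want to see what the kernel is actually enforcing, the usual
checks look like this (assuming a Unix-ish shell):

    # Linux: system-wide limit, then the per-process limit of the current shell
    cat /proc/sys/fs/file-max
    ulimit -n

    # BSD / Mac: system-wide and per-process limits
    sysctl kern.maxfiles kern.maxfilesperproc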

An old email from Tom Lane notes that you need to make sure your kernel
can support approximately

    max_connections * (max_files_per_process + max_connections)

open file handles, plus whatever other processes on the system require,
and comments that Mac treats each semaphore as an open file.
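
As a worked example with illustrative numbers: at max_connections = 100
and the default max_files_per_process = 1000, that comes to
100 * (1000 + 100) = 110,000 file handles the kernel could be asked to
supply, before counting anything else running on the box.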

My interpretation is that if your kernel enforces things properly you
don't need to worry. If it doesn't, reduce your max_connections and/or
max_files_per_process as needed.
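
If you do end up dialing things back, both knobs live in
postgresql.conf (the numbers here are only an example):

    max_connections = 50          # fewer backends, fewer potential open files
    max_files_per_process = 500   # default is 1000

Both can only be changed with a server restart.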

Cheers,
Steve
