
Re: does wal archiving block the current client connection?

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Jeff Frost <jeff(at)frostconsultingllc(dot)com>
Cc:	pgsql-admin(at)postgresql(dot)org
Subject:	Re: does wal archiving block the current client connection?
Date:	2006-05-15 22:46:32
Message-ID:	1147733193.5074.87.camel@localhost.localdomain
Lists:	pgsql-admin pgsql-hackers

On Mon, 2006-05-15 at 14:29 -0700, Jeff Frost wrote:
> On Mon, 15 May 2006, Simon Riggs wrote:
>
> > On Mon, 2006-05-15 at 09:28 -0700, Jeff Frost wrote:
> >> I've run into a problem with a PITR setup at a client. The problem is that
> >> whenever the CIFS NAS device that we're mounting at /mnt/pgbackup has
> >> problems
> >
> > What kind of problems?
>
> It becomes unwritable for whatever reason CIFS shares become unwritable. It's
> a windows 2003 NAS device and a reboot solves the problem, but it leaves no
> event logs on the windows side of things, so difficult to determine the root
> cause.

You should be able to re-create this problem without the database being
involved. Just set up a driver program over the top of the archive
script so it flies in a tighter loop than the archiver would make it. If
you still get the Windows NAS error... well, I'll leave that to you.
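A minimal sketch of such a driver, assuming Python is available on the database host. The function names, the iteration count and the timeout are illustrative choices, not part of PostgreSQL; the command you pass in should be whatever your archive_command runs, pointed at a dummy file rather than a real WAL segment.

```python
import subprocess
import time


def stress_archive_command(command, iterations=100, pause_seconds=0.0, timeout_seconds=60):
    """Run `command` repeatedly, with no database involved.

    Returns a list of (iteration, returncode, elapsed_seconds) tuples.
    returncode is None when the command did not finish within the timeout.
    """
    results = []
    for i in range(iterations):
        start = time.monotonic()
        try:
            completed = subprocess.run(command, timeout=timeout_seconds)
            returncode = completed.returncode
        except subprocess.TimeoutExpired:
            returncode = None  # the command hung past the timeout
        elapsed = time.monotonic() - start
        results.append((i, returncode, elapsed))
        if pause_seconds:
            time.sleep(pause_seconds)
    return results


def summarize(results):
    """Count the runs that did not exit with status 0 and report the slowest run."""
    failures = [r for r in results if r[1] != 0]
    slowest = max((r[2] for r in results), default=0.0)
    return {"runs": len(results), "failures": len(failures), "slowest_seconds": slowest}
```

For example, summarize(stress_archive_command(["cp", "/tmp/dummy_wal", "/mnt/pgbackup/dummy_wal"])) would hammer the share with copies of a hypothetical /tmp/dummy_wal file. Be aware that a process stuck in I/O on a hung mount may not respond to the timeout, in which case the driver itself will hang, which is also useful information.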

> >> , it seems that the current client connection gets blocked and this
> >> eventually builds up to a "sorry, too many clients already" error.

Tell us more about what the blockage looks like. We may yet thank
Windows for finding a bug, but I'm not sure yet.

> > This sounds like the archiver keeps waking up and trying the command,
> > but it fails, yet that request is causing a resource leak on the NAS.
> > Eventually, the archiver's retry of the command fails as well. Or am I
> > misunderstanding your issues?
>
> That's possible. Does the archiver use a DB connection whenever it tries to
> run archive_command?

Not at all.

> If so, then that's almost certainly the problem. I
> suspect a faster timeout on the CIFS mount would fix the issue as well, but I
> didn't see any such options in the mount.cifs manpage.
>
> > The archiver is designed around the thought that *attempting* to archive
> > is a task that it can do indefinitely without a problem; it's up to you
> > to spot that the link is down.
> >
> > We can put something in to make the retry period elongate, but you'd
> > need to put a reasonable case for how that would increase robustness.
>
> That all sounds perfectly reasonable. If the archiver is using up a
> connection for each archive_command issued, then I suspect that's our problem,
> as there were also lots of debug logs showing that the db was trying to
> archive several WAL files at near the same time, likely pushing us over our
> 100 connection limit.

Oh, you mean database clients cannot connect. I thought you meant you
were getting a CIFS client connection error from the archiver. That's
weird.

> If the archiver does not use up a connection, then I
> suppose I don't know what's actually going on unless postgres blocks the
> commit of the transaction which triggered the archive_command until the
> archive command finishes (or fails).

I think you need to show the database log covering the period in error.

Are you running out of disk space in the database directory? Can you
check again that pg_xlog and pg_xlog/archive_status are definitely not on
the NAS?
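
One way to check both points is sketched below, assuming Python is available on the server. The helper names are made up for illustration, and you would substitute your own data directory for the path in the usage note. It compares the device reported by stat() for each directory with that of the NAS mount point, and reports the free space on a path.

```python
import os
import shutil


def on_same_filesystem(path_a, path_b):
    """True if stat() reports the same device for both paths (symlinks are followed)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def paths_sharing_filesystem(mount_point, paths):
    """Return the existing paths that live on the same filesystem as mount_point."""
    return [p for p in paths if os.path.exists(p) and on_same_filesystem(p, mount_point)]


def free_bytes(path):
    """Free space, in bytes, on the filesystem that holds path."""
    return shutil.disk_usage(path).free
```

With a hypothetical data directory of /var/lib/pgsql/data, paths_sharing_filesystem("/mnt/pgbackup", ["/var/lib/pgsql/data/pg_xlog", "/var/lib/pgsql/data/pg_xlog/archive_status"]) should return an empty list if neither directory is on the NAS, and free_bytes("/var/lib/pgsql/data") shows how much room is left for WAL files to accumulate.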

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Re: does wal archiving block the current client connection? at 2006-05-15 21:29:57 from Jeff Frost

Responses

Re: does wal archiving block the current client connection? at 2006-05-16 12:08:07 from Simon Riggs
