From: | Sean Laurent <sean(at)studyblue(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Postgres 9.01, Amazon EC2/EBS, XFS, JDBC and lost connections |
Date: | 2011-10-11 22:38:26 |
Message-ID: | CAK=aZ=kBGOYkxYjNppec4dTg6SocyD0NW6y-t8H57PaQrL+90Q@mail.gmail.com |
Lists: | pgsql-general |
On Fri, Oct 7, 2011 at 12:36 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Sean Laurent <sean(at)studyblue(dot)com> writes:
> > We've been running into a particularly strange problem that I'm trying to
> > better understand. The super short version is that our application servers
> > lose their connection to the database when I run a backup during periods of
> > higher load and fail to reconnect.
>
> That's just weird. It sounds like the "xfs_freeze" operation, or the
> snapshotting operation, is somehow interrupting network traffic. I'd
> not expect such a thing on a normal server, but who knows what's
> connected to what in an Amazon EC2 instance?
>
> Anyway, I'd suggest trying to instrument something to prove or disprove
> that there's a networking failure involved. It might be as simple as
> watching "ping" behavior ...
Agreed, it's very weird. EBS volumes are effectively network-attached
storage, so blaming network connectivity was my first inclination as
well. Unfortunately, it's definitely not a network failure:
- AWS support team has not detected any network outages affecting the
EC2 instance or the EBS volumes at any time remotely near when our
outages occurred.
- I can consistently ping the database instance from the application
servers while the problem is occurring.
- I can SSH into the database instance and access Postgres while the
problem is occurring.
--
Sean Laurent
Director of Operations
StudyBlue, Inc.