Re: Hanging backends and possible index corruption

From: Bernd Helmle <mailings(at)oopsware(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hanging backends and possible index corruption
Date: 2013-01-26 10:58:19
Message-ID: 7FC07A3E72CC7DA6BC407FC1@localhost
Lists: pgsql-hackers

--On 25 January 2013 20:37:32 -0500 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> Don't know how careful pgbtreecheck is. The pg_filedump output isn't
> very helpful because you filtered away the flags, so we can't tell if
> any of these pages are deleted. (If they are, the duplicate-looking
> links might not be errors, since we intentionally don't reset a deleted
> page's left/right links when deleting it.)
>

Ah, wasn't aware of this.
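
For anyone following along, here is a minimal sketch of the check described
above: decode btpo_flags from the special space and ignore deleted pages when
looking for duplicate-looking right links. The flag bits and P_NONE are taken
from src/include/access/nbtree.h as of 9.1. The input layout (a dict of block
number to prev/next/flags) and both function names are assumptions made for
illustration, not the output format of pg_filedump or pgbtreecheck.

```python
# Flag bits from src/include/access/nbtree.h (PostgreSQL 9.1).
BTP_LEAF = 1 << 0
BTP_ROOT = 1 << 1
BTP_DELETED = 1 << 2
BTP_META = 1 << 3
BTP_HALF_DEAD = 1 << 4
BTP_SPLIT_END = 1 << 5
BTP_HAS_GARBAGE = 1 << 6

P_NONE = 0  # "no sibling" marker used in btpo_prev / btpo_next

FLAG_NAMES = [
    (BTP_LEAF, "LEAF"),
    (BTP_ROOT, "ROOT"),
    (BTP_DELETED, "DELETED"),
    (BTP_META, "META"),
    (BTP_HALF_DEAD, "HALF_DEAD"),
    (BTP_SPLIT_END, "SPLIT_END"),
    (BTP_HAS_GARBAGE, "HAS_GARBAGE"),
]


def decode_flags(flags):
    """Return the names of all bits set in a btpo_flags value."""
    return [name for bit, name in FLAG_NAMES if flags & bit]


def suspicious_right_links(pages):
    """Find right links shared by more than one non-deleted page.

    `pages` is a hypothetical mapping: blkno -> (btpo_prev, btpo_next, btpo_flags).
    Deleted pages are skipped because their stale links are expected.
    """
    owners = {}
    for blkno in sorted(pages):
        _prev, nxt, flags = pages[blkno]
        if flags & BTP_DELETED or nxt == P_NONE:
            continue
        owners.setdefault(nxt, []).append(blkno)
    return {nxt: blks for nxt, blks in owners.items() if len(blks) > 1}
```

With this, a deleted page that still points at a live sibling is not reported,
while two live pages sharing the same right link are.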

> Could we see the whole special-space dump for each of the pages you're
> worried about?
>

Attached.

> One thought that occurs to me is that POWER is a weak-memory-ordering
> architecture, so that it's a tenable idea that this has something to do
> with changing page links while not holding sufficient lock on the page.
> I don't see btree doing that anywhere, but ...
>
> BTW, how long has this installation been around, and when did you start
> seeing funny behavior? Can you say with reasonable confidence that the
> bug was *not* present in any older PG versions?
>

This machine went into production around August last year with (AFAIR)
9.1.5. There were also performance and stress tests on this machine before
it went into production, with no noticeable problems.

However, what I missed before is that there was some trouble with the
storage multipaths. It seems that in early January the machine lost some of
its paths to the SAN, but they were recovered a few seconds later, so I can
no longer exclude this as the cause, even though the paths are redundant.
What strikes me is that the index was recreated in the meantime, after this
issue...

We will watch this machine more closely over the next couple of weeks to
see whether the issue comes back.

--
Thanks

Bernd

Attachment Content-Type Size
dump.txt.bz2 application/octet-stream 8.5 KB
