Re: FATAL: could not read statistics message

From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: Tony Wasson <ajwasson(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: FATAL: could not read statistics message
Date: 2006-05-17 00:49:30
Message-ID: 446A731A.1060601@mail.nih.gov
Lists: pgsql-general

Tony Wasson wrote:
> On 5/16/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> "Tony Wasson" <ajwasson(at)gmail(dot)com> writes:
>> > When I saw the same error as you, the stats collector process was
>> > missing.
>>
>> The collector, or the buffer process? The reported message would be
>> emitted by the buffer process, after which it would immediately exit.
>> (The collector would go away too once it noticed EOF on its input.)
>> By and by the postmaster should start a fresh pair of processes.
>
>
> The stats collector was dead and would not respawn. Our options seemed
> limited to restarting postmaster or ignoring the error.
>
> Here was what the process list looked like:
>
> kangaroo:~ twasson$ ps waux | grep post
> pgsql 574 0.0 -0.0 460104 832 p0 S Wed06AM 10:26.98 /usr/local/pgsql/bin/postmaster -D /Volumes/Vol0/pgsql-data
> pgsql 578 0.0 -5.2 460356 108620 p0 S Wed06AM 27:43.68 postgres: writer process
> twasson 23844 0.0 -0.0 18172 688 std S+ 10:05AM 0:00.01 grep post

That matches what I recall as well, though I wasn't meticulous enough to
save the process list.

>> IIRC, the postmaster's spawning is rate-limited to once a minute,
>> so if the new buffer were immediately dying with the same error,
>> that would explain your observation of once-a-minute messages.
>>
>> This all still leaves us no closer to understanding *why* the recv()
>> is failing, though. What it does suggest is that the problem is a
>> hard, repeatable error when it does occur, which makes me loath to
>> put in the quick-fix "retry on EAGAIN" that I previously suggested.
>> If it is a hard error then that will just convert the problem into
>> a busy-loop that'll eat all your CPU cycles ... not much of an
>> improvement ...
>>
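
To make the busy-loop concern above concrete, here is a minimal sketch in
Python (not the backend's C, and not the actual pgstat code). It assumes a
hypothetical recv_with_bounded_retry() helper and a hypothetical
FlakyReceiver stand-in for the socket read. The point is only that a retry
on EAGAIN needs a cap and a pause; without them, a hard, repeatable EAGAIN
would spin forever.

```python
import errno
import time


def recv_with_bounded_retry(recv_fn, max_retries=3, delay=0.01):
    """Call recv_fn(); retry on EAGAIN/EWOULDBLOCK at most max_retries times.

    Hypothetical helper for illustration only. An unbounded "retry on
    EAGAIN" would never leave this loop if the error were permanent.
    """
    attempts = 0
    while True:
        try:
            return recv_fn()
        except OSError as exc:
            # Any other errno is treated as a hard error and re-raised at once.
            if exc.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
                raise
            attempts += 1
            if attempts > max_retries:
                # Give up instead of spinning: the error looks persistent.
                raise
            # Pause between attempts so retries do not eat the CPU.
            time.sleep(delay)


class FlakyReceiver:
    """Hypothetical stand-in for a socket read that fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise OSError(errno.EAGAIN, "Resource temporarily unavailable")
        return b"stats"
```

With a transient failure (say FlakyReceiver(failures=2)) the helper returns
the data on the third call. With a receiver that always fails, it gives up
after max_retries + 1 calls and re-raises the EAGAIN error, which is roughly
the behaviour one would want if the failure turns out to be hard.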
