Re: There's some sort of race condition with the new FSM stuff

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org, books(at)ejurka(dot)com
Subject: Re: There's some sort of race condition with the new FSM stuff
Date: 2008-10-14 20:02:56
Message-ID: 48F4FAF0.9010306@enterprisedb.com
Lists: pgsql-hackers

Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> Zdenek Kotala wrote:
>>> For security reason any OS should clean memory pages before process
>>> first touches them.
>
>> Yeah. But it doesn't necessarily need to fill them with zeros, any
>> garbage will do.
>
> Yeah, but the observed symptoms seem to indicate that the fill is mostly
> zeroes with a very occasional one. This seems less than probable.
>
> The only theory I've thought of that seems to fit the facts is that
> someplace we have a wild store that is clobbering that particular word.
> Which is a pretty unpleasant thought.

The bug only affected fsync/forget requests that are forwarded from
backends, not the ones that bgwriter puts into the hash table itself. If
the fsync request is queued by the bgwriter itself, and the forget
request comes from a backend, you get the error that we saw, I think. I
noticed that kudu has a small shared_buffers setting, 5.6 MB, compared to
most buildfarm members, which might explain the different behavior with
respect to which buffers are written by backends and which are written
by bgwriter.

(kudu is green now BTW)

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com