Re: buffer assertion tripping under repeat pgbench load

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Date: 2012-12-26 18:33:39
Message-ID: 19837.1356546819@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Smith <greg(at)2ndQuadrant(dot)com> writes:
> To try and speed up replicating this problem I switched to a smaller
> database scale, 100, and I was able to get a crash there. Here's the
> latest:

> 2012-12-26 00:01:19 EST [2278]: WARNING: refcount of base/16384/57610
> blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0
> 2012-12-26 00:01:19 EST [2278]: WARNING: buffers with non-zero refcount
> is 1
> TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line:
> 1720)

> That's the same weird 1073741824 count as before. I was planning to
> dump some index info, but then I saw this:

> $ psql -d pgbench -c "select relname,relkind,relfilenode from pg_class
> where relfilenode=57610"
> relname | relkind | relfilenode
> ------------------+---------+-------------
> pgbench_accounts | r | 57610

> Making me think this isn't isolated to being an index problem.

Yeah, that destroys my theory that there's something broken about index
management specifically. Now we're looking for something that can
affect any buffer's refcount, which more than likely means it has
nothing to do with the buffer's contents ...
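
A side note on the reported value: 1073741824 is exactly 2^30 (0x40000000), so the bogus refcount has a single bit set. That is only an arithmetic observation and does not by itself say what caused it. A minimal check, where the helper name `single_bit_position` is hypothetical:

```python
def single_bit_position(value: int):
    """Return the index of the only set bit in value, or None if value
    does not have exactly one bit set. (Hypothetical helper name.)"""
    # A positive power of two has exactly one set bit, so clearing the
    # lowest set bit with value & (value - 1) leaves zero.
    if value > 0 and value & (value - 1) == 0:
        return value.bit_length() - 1
    return None


reported = 1073741824  # refcount value from the quoted WARNING line
print(hex(reported), single_bit_position(reported))  # prints: 0x40000000 30
```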

> I tried
> to soldier on with pg_filedump anyway. It looks like the last version I
> saw there (9.2.0 from November) doesn't compile anymore:

Meh, looks like it needs fixes for Heikki's int64-xlogrecoff patch.
I haven't gotten around to doing that yet, but would gladly take a
patch if anyone wants to do it. However, I now doubt that examining
the buffer content will help much on this problem.

Now that we know the bug's reproducible on smaller instances, could you
put together an exact description of what you're doing to trigger
it? What are the DB configuration, the pgbench parameters, and so on?

Also, it'd be worthwhile to just repeat the test a few more times
to see if there's any sort of pattern in which buffers get affected.
I'm now suspicious that it might not always be just one buffer,
for example.
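
One way to look for such a pattern across repeated runs is to tally the WARNING lines from the server logs. The following is a rough sketch, assuming the log lines keep the format shown in the quoted output above; the regular expression and the function name `tally_refcount_warnings` are illustrative assumptions, not part of PostgreSQL.

```python
import re
from collections import Counter

# Assumed line format, taken from the quoted log output:
#   WARNING: refcount of base/16384/57610 blockNum=118571, flags=0x106
#   is 1073741824 should be 0, globally: 0
# \s+ is used between fields so a line wrapped by a mail client still matches.
WARNING_RE = re.compile(
    r"refcount of (?P<path>\S+)\s+blockNum=(?P<block>\d+),"
    r"\s+flags=(?P<flags>0x[0-9a-fA-F]+)"
    r"\s+is\s+(?P<local>\d+)\s+should be 0,"
    r"\s+globally:\s+(?P<shared>\d+)"
)


def tally_refcount_warnings(log_text: str):
    """Count refcount warnings per relation file and per bogus value."""
    by_relation = Counter()
    by_value = Counter()
    blocks = []
    for m in WARNING_RE.finditer(log_text):
        by_relation[m.group("path")] += 1
        by_value[int(m.group("local"))] += 1
        blocks.append((m.group("path"), int(m.group("block"))))
    return by_relation, by_value, blocks


sample = (
    "2012-12-26 00:01:19 EST [2278]: WARNING: refcount of base/16384/57610 "
    "blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0\n"
    "2012-12-26 00:01:19 EST [2278]: WARNING: buffers with non-zero refcount is 1\n"
)
by_relation, by_value, blocks = tally_refcount_warnings(sample)
print(by_relation)
print(by_value)
print(blocks)
```

Feeding the concatenated logs from several crashes into the function would show whether the same relation, block, or refcount value keeps recurring, and whether more than one buffer is reported per crash.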

regards, tom lane
