From: | "Erik Rijkers" <er(at)xs4all(dot)nl> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741 |
Date: | 2012-05-26 09:21:34 |
Message-ID: | bdc45602108c468a8f9eb132f6a94248.squirrel@webmail.xs4all.nl |
Lists: | pgsql-hackers |
pg 9.2 git master
AMD 8120 (8-core) / 6 GB memory / Centos 6.2
I have experimented a bit with dropping a table on the master, then querying that table from a
sync-rep slave. It is a little worrying that this, the first test I tried, crashes the slave.
There are two instances on one machine, head1 (=master) and head2 (=sync-rep slave).
First, I generated a tab-separated file, a one-off, to be used in the test:
echo "
copy (
select
repeat('X',20) as c1
, repeat('X',20) as c2
, repeat('X',20) as c3
, repeat('X',20) as c4
, repeat('X',20) as c5
from generate_series(1, 200000)
)
to stdout
csv delimiter E'\t';
" | $HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql -p 6564 -d testdb > dropload_copy.txt
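Before zipping the file, a quick sanity check can confirm it has the expected shape. This is only a sketch: the helper name `check_copy_file` is hypothetical (it is not part of the original test), and it assumes the file contains nothing but data rows, each with five tab-separated fields.

```shell
# Hypothetical helper, not part of the original test: succeed only if the file
# has exactly the expected number of lines and every line has five
# tab-separated fields (c1..c5).
check_copy_file() {
  local file=$1 expected_rows=$2
  awk -F'\t' -v rows="$expected_rows" '
    NF != 5 { bad++ }                                  # count malformed lines
    END { exit (NR == rows && bad == 0) ? 0 : 1 }      # 0 = file looks right
  ' "$file"
}
```

For the file generated above, this could be used as `check_copy_file dropload_copy.txt 200000 && gzip dropload_copy.txt`.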
That txt file is zipped, and the actual test consists of a bash while loop which
1. drops the table
2. loads the file into the table
3. Either:
a. nothing
b. does a select count(*) on the table
So, it repeats the following:
zcat dropload_copy.txt.gz \
| grep -v '^#' \
| $HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql -p 6564 -d testdb -c "
drop table if exists t;
create table t (
c1 text,
c2 text,
c3 text,
c4 text,
c5 text
);
copy t from stdin csv delimiter E'\t';
analyze t;";
PAUSE_DURATION=0
PSQL=$HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql
if [[ 0 -eq 1 ]]; # ON-OFF switch
then
echo "sleep $PAUSE_DURATION"
sleep $PAUSE_DURATION;
(
echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6564 -d testdb # master
echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6565 -d testdb # wal_receiver_01
#echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6566 -d testdb # wal_receiver_02
) | grep -v '^$'
fi
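The enclosing while loop is not shown above. A minimal sketch of its shape follows, under stated assumptions: `RUN_SELECT`, `MAX_ITER`, `reload_table`, `count_rows` and `run_loop` are hypothetical names, the two helper functions are placeholders that only print what the real zcat/psql steps would do, and the iteration cap exists only so the sketch terminates.

```shell
# Sketch of the outer loop only; every name here is hypothetical.
RUN_SELECT=${RUN_SELECT:-0}   # the ON-OFF switch: 1 adds the select count(*) step
MAX_ITER=${MAX_ITER:-3}       # cap for illustration; the real test loops for hours

# Placeholders standing in for the real commands shown above.
reload_table() { echo "drop/create/copy/analyze on master"; }
count_rows()   { echo "select count(*) on port $1"; }

run_loop() {
  local i=0
  while [ "$i" -lt "$MAX_ITER" ]; do
    reload_table                      # steps 1 and 2: drop the table, load the file
    if [ "$RUN_SELECT" -eq 1 ]; then  # step 3b, only when the switch is on
      count_rows 6564                 # master
      count_rows 6565                 # sync-rep slave
    fi
    i=$((i + 1))
  done
}
```

With the switch off, each iteration performs only the reload; with it on, each reload is followed immediately by a count on both instances, which is the variant that triggers the crash.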
This runs fine for hours on end, as long as the ON-OFF switch is disabled.
But when that if-block is enabled the slave crashes after a while (sometimes almost immediately; it
never survives longer than 20 minutes):
2012-05-26 10:44:22.617 CEST 10274 ERROR: could not fsync file "base/21268/32807": No such file or directory
2012-05-26 10:44:28.465 CEST 10274 ERROR: could not fsync file "base/21268/32867": No such file or directory
2012-05-26 10:44:28.587 CEST 10270 FATAL: could not open file "base/21268/32994": No such file or directory
2012-05-26 10:44:28.588 CEST 10270 CONTEXT: writing block 2508 of relation base/21268/32994
xlog redo multi-insert (init): rel 1663/21268/33006; blk 3117; 58 tuples
TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741)
2012-05-26 10:44:31.131 CEST 10269 LOG: startup process (PID 10270) was terminated by signal 6: Aborted
2012-05-26 10:44:31.131 CEST 10269 LOG: terminating any other active server processes
Crazy scenario, I'll admit, but surely this shouldn't be able to crash the slave?
I attach the logfiles of master (=head1) and slave (=head2). They show how the above ran for an hour
without problems (while the ON/OFF switch was disabled), and how the crash came quickly when I
switched it on (to add the select count(*) statements).
Erik Rijkers
Attachment | Content-Type | Size |
---|---|---|
logfile.head2 | application/octet-stream | 6.0 KB |
logfile.head1 | application/octet-stream | 4.2 KB |