From: | "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Separating bgwriter and checkpointer |
Date: | 2011-10-05 04:10:09 |
Message-ID: | CAHHcrepQNCN3HoXSWAOtsj3HJKp5NoepB2ky3dFgVA4JCXh-Qw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2011/10/4 Simon Riggs <simon(at)2ndquadrant(dot)com>:
> The problem is the *same* one I fixed in v2, yet now I see I managed
> to somehow exclude that fix from the earlier patch. Slap. Anyway,
> fixed again now.
Ah ok! I started reviewing the v4 patch version, this is my comments:
Submission review
===============
1. The patch applies cleanly to current master (fa56a0c3e) but isn't
in context diff format;
Feature test
==========
1. Since I patched and make installed it, I can see the expected
processes: writer and checkpointer;
2. I did the following tests with the following results:
2.1 Running a long time pgbench didn't emit any assertion failure or
crash and the checkpoint works as before patch:
LOG: checkpoint starting: xlog
LOG: checkpoint complete: wrote 300 buffers (9.8%); 0 transaction
log file(s) added, 0 removed, 0 recycled; write=26.103 s, sync=6.492
s, total=34.000 s; sync files=13, longest=4.684 s, average=0.499 s
LOG: checkpoint starting: time
LOG: checkpoint complete: wrote 257 buffers (8.4%); 0 transaction
log file(s) added, 0 removed, 3 recycled; write=21.819 s, sync=9.610
s, total=32.076 s; sync files=7, longest=6.452 s, average=1.372 s
2.2 Forcing a checkpoint when filesystem has enough free space works
fine while pgbench is running:
LOG: checkpoint starting: immediate force wait
LOG: checkpoint complete: wrote 1605 buffers (52.2%); 0 transaction
log file(s) added, 0 removed, 2 recycled; write=0.344 s, sync=22.750
s, total=23.700 s; sync files=10, longest=15.586 s, average=2.275 s
2.3 Forcing a checkpoint when filesystem are full, works as expected:
LOG: checkpoint starting: immediate force wait time
ERROR: could not write to file "pg_xlog/xlogtemp.4380": Não há
espaço disponível no dispositivo
ERROR: checkpoint request failed
HINT: Consult recent messages in the server log for details.
STATEMENT: CHECKPOINT ;
...
ERROR: could not extend file "base/16384/16405": wrote only 4096 of
8192 bytes at block 10
HINT: Check free disk space.
STATEMENT: INSERT INTO pgbench_history (tid, bid, aid, delta,
mtime) VALUES (69, 3, 609672, -3063, CURRENT_TIMESTAMP);
PANIC: could not write to file "pg_xlog/xlogtemp.4528": Não há
espaço disponível no dispositivo
STATEMENT: END;
LOG: server process (PID 4528) was terminated by signal 6: Aborted
Then I freed some space and started it again and the server ran properly:
LOG: database system was shut down at 2011-10-05 00:46:33 BRT
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
...
LOG: checkpoint starting: immediate force wait
LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction
log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s,
total=0.181 s; sync files=0, longest=0.000 s, average=0.000 s
2.2 Running a pgbench and interrupting postmaster a few seconds later,
seems to work as expected, returning the output:
... cut ...
LOG: statement: SELECT abalance FROM pgbench_accounts WHERE aid = 148253;
^CLOG: statement: UPDATE pgbench_tellers SET tbalance = tbalance +
934 WHERE tid = 85;
DEBUG: postmaster received signal 2
LOG: received fast shutdown request
LOG: aborting any active transactions
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
... cut ...
LOG: disconnection: session time: 0:00:14.917 user=guedes
database=bench host=[local]
LOG: disconnection: session time: 0:00:14.773 user=guedes
database=bench host=[local]
DEBUG: server process (PID 1910) exited with exit code 1
DEBUG: server process (PID 1941) exited with exit code 1
LOG: shutting down
LOG: checkpoint starting: shutdown immediate
DEBUG: SlruScanDirectory invoking callback on pg_multixact/offsets/0000
DEBUG: SlruScanDirectory invoking callback on pg_multixact/members/0000
DEBUG: checkpoint sync: number=1 file=base/16384/16398 time=1075.227 msec
DEBUG: checkpoint sync: number=2 file=base/16384/16406 time=16.832 msec
DEBUG: checkpoint sync: number=3 file=base/16384/16397 time=12306.204 msec
DEBUG: checkpoint sync: number=4 file=base/16384/16397.1 time=2122.141 msec
DEBUG: checkpoint sync: number=5 file=base/16384/16406_fsm time=32.278 msec
DEBUG: checkpoint sync: number=6 file=base/16384/16385_fsm time=11.248 msec
DEBUG: checkpoint sync: number=7 file=base/16384/16388 time=11.083 msec
DEBUG: checkpoint sync: number=8 file=base/16384/11712 time=11.314 msec
DEBUG: checkpoint sync: number=9 file=base/16384/16397_vm time=11.103 msec
DEBUG: checkpoint sync: number=10 file=base/16384/16385 time=19.308 msec
DEBUG: attempting to remove WAL segments older than log file
000000010000000000000000
DEBUG: SlruScanDirectory invoking callback on pg_subtrans/0000
LOG: checkpoint complete: wrote 1659 buffers (54.0%); 0 transaction
log file(s) added, 0 removed, 0 recycled; write=0.025 s, sync=15.617
s, total=15.898 s; sync files=10, longest=12.306 s, average=1.561 s
LOG: database system is shut down
Then I started the server again and it ran properly.
Well, all the tests was running with the default postgresql.conf in my
laptop but I'll setup a more "real world" environment to test for
performance regression. Until now I couldn't notice any significant
difference in TPS before and after patch in a small environment. I'll
post something soon.
Best regards,
--
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br
From | Date | Subject | |
---|---|---|---|
Next Message | daveg | 2011-10-05 04:12:22 | timezone buglet? |
Previous Message | Greg Stark | 2011-10-05 02:55:28 | Re: Inlining comparators as a performance optimisation |