Problem with data corruption and psql memory usage

From: Gerhard Wiesinger <lists(at)wiesinger(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Problem with data corruption and psql memory usage
Date: 2007-05-08 19:42:27
Message-ID: Pine.LNX.4.64.0705072030170.5094@bbs.intern
Lists: pgsql-general

Hello!

I'm new to PostgreSQL and imported about 2.8 million rows using
ordinary INSERT commands.

The configuration was (differences from the default config):
listen_addresses = '*'
temp_buffers = 20MB # min 800kB
work_mem = 20MB # min 64kB
maintenance_work_mem = 32MB # min 1MB
fsync = off # turns forced synchronization on or off
full_page_writes = off
wal_buffers = 20MB

It crashed with a core dump (ulimit -c 0):
LOG: server process (PID 12720) was terminated by signal 11
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.

Afterwards I got the following error messages:
WARNING: index "table_pkey" contains 2572948 row versions, but
table contains 2572949 row versions
HINT: Rebuild the index with REINDEX.

LOG: server process (PID 13794) was terminated by signal 11
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.

LOG: could not fsync segment 0 of relation 1663/16386/42726: Input/output
error
ERROR: storage sync failed on magnetic disk: Input/output error

ERROR: could not access status of transaction 808464434
DETAIL: Could not open file "pg_clog/0303": No such file or directory.

Afterwards I got:
ERROR: could not access status of transaction 5526085
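
As a side note on the two "could not access status of transaction" errors
above, the pg_clog file name can be derived from the transaction ID. The
following Python sketch assumes the stock 8.2 build constants (8 kB blocks,
2 status bits per transaction, 32 pages per clog segment) and a
little-endian x86 machine; the function names are made up for illustration.
It also shows the ID as raw bytes, since an implausibly large ID can turn
out to be overwritten text rather than a real transaction number. That
reading is only a guess here, not something the logs confirm.

```python
# Assumed build defaults for PostgreSQL 8.2 (not taken from the logs above).
BLOCK_SIZE = 8192            # bytes per clog page
XACTS_PER_BYTE = 4           # 2 status bits per transaction
PAGES_PER_SEGMENT = 32       # clog pages per segment file

XACTS_PER_SEGMENT = BLOCK_SIZE * XACTS_PER_BYTE * PAGES_PER_SEGMENT


def clog_segment_name(xid: int) -> str:
    """Return the pg_clog file name that would hold the status of xid."""
    return format(xid // XACTS_PER_SEGMENT, "04X")


def xid_as_ascii(xid: int) -> str:
    """Show the 4 bytes of xid (little-endian, assumed x86) as text."""
    raw = xid.to_bytes(4, "little")
    # Non-printable bytes are shown as "." so the result is always readable.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    for xid in (808464434, 5526085):
        print(xid, clog_segment_name(xid), repr(xid_as_ascii(xid)))
```

Under these assumptions, 808464434 maps to segment 0303, which matches the
file name in the error message, and its bytes read as the ASCII text "2000".
5526085 maps to segment 0005 and reads as "ERT" followed by a zero byte.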

There were also some coredumps afterwards where I have a stack trace:
#0 0x0807d241 in heap_deform_tuple ()
#1 0x08095b8c in toast_delete ()
#2 0x0809432e in heap_delete ()
#3 0x0814bfa4 in ExecutorRun ()
#4 0x081d7ece in FreeQueryDesc ()
#5 0x081d80c1 in FreeQueryDesc ()
#6 0x081d8979 in PortalRun ()
#7 0x081d4480 in pg_parse_query ()
#8 0x081d5a57 in PostgresMain ()
#9 0x081ad4fe in ClosePostmasterPorts ()
#10 0x081ae307 in PostmasterMain ()
#11 0x0816dec0 in main ()

So my questions are:
1.) Are my settings too aggressive (fsync=off, full_page_writes=off)?
2.) Should PostgreSQL still recover after a crash with these 2 settings
in effect, or is data corruption to be expected with them?
3.) Any ideas what caused the core dumps?

Only one session wrote at a time. From other sessions I only ran
"select count(*) from table".

Afterwards I cleaned up the tables, did a pg_dumpall and restore with a
fresh initdb, and removed these 2 settings (fsync=off,
full_page_writes=off); everything then went fine.

I also had a problem with psql:
psql < file.sql
=> psql used around 2 GB of virtual memory, with heavy swapping. After I
interrupted it with Ctrl-C and restarted it, it worked well. Any ideas?

The machine is otherwise stable, so I would say that a hardware failure
is not the problem.

The PostgreSQL version is 8.2.3 on Fedora Core 6 (FC6).

Thank you for the answer.

Ciao,
Gerhard

--
http://www.wiesinger.com/
