From: | Hans-Jürgen Schönig <postgres(at)cybertec(dot)at> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Manfred Koizar <mkoi-pg(at)aon(dot)at>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: WAL replay failure after file truncation(?) |
Date: | 2005-05-27 13:59:40 |
Message-ID: | 429727CC.1000608@cybertec.at |
Lists: | pgsql-hackers |
Tom Lane wrote:
> Hans-Jürgen Schönig <postgres(at)cybertec(dot)at> writes:
>
>>My question is: What happens if the system is killed inside
>>rebuild_relation or inside swap_relfilenodes which is called by
>>rebuild_relation?
>
>
> Nothing at all, because the system catalog updates aren't committed yet,
> and we haven't done anything to the relation's old physical file.
This is actually what I expected.
I have gone through the code and it looks correct.
TRUNCATE is the only command in this application which can potentially
cause the problem (it is very unlikely that INSERT removes a file).
> If I were you I'd be looking into whether your disk hardware honors
> write ordering properly. This sounds like something allowed the
> directory change to reach disk before the transaction commit WAL record
> did; which is impossible if fsync is doing what it's supposed to.
>
> regards, tom lane
We are on a Sun Solaris (x86) box here. I am not sure what Sun has
broken to make this error happen. Obviously it happens only once per
1,000,000 tries ...
I am just trying to figure out whether the bug could potentially be
inside PostgreSQL. I would have been surprised if somebody had overlooked
a problem like that.
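For reference, the ordering Tom describes can be sketched as follows. This is only a minimal illustration in Python on a POSIX system, not PostgreSQL's actual code: the function names, file names and record format are made up for the example. It shows what the software asks for (the commit record is fsynced before the file is removed); it cannot show whether the disk hardware honors that order.

```python
import os
import tempfile


def commit_then_unlink(wal_path, data_path, commit_record):
    """Make the commit record durable first, then remove the data file.

    Hypothetical helper for illustration only; assumes a POSIX system
    where a directory can be opened read-only and fsynced.
    """
    fd = os.open(wal_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(commit_record)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # commit record must reach disk before the unlink
    finally:
        os.close(fd)

    # Only now change the directory (remove the old physical file).
    os.unlink(data_path)

    # Flush the directory entry change as well.
    dir_fd = os.open(os.path.dirname(os.path.abspath(data_path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def demo():
    """Run the sketch in a scratch directory and report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        wal_path = os.path.join(tmp, "wal.log")      # made-up name
        data_path = os.path.join(tmp, "relfile")     # made-up name
        with open(data_path, "wb") as f:
            f.write(b"old relation contents")
        commit_then_unlink(wal_path, data_path, b"COMMIT truncate\n")
        with open(wal_path, "rb") as f:
            wal_contents = f.read()
        return wal_contents, os.path.exists(data_path)


if __name__ == "__main__":
    print(demo())
```

If the hardware or driver reorders these writes, or reports the fsync as complete before the data is on stable storage, the directory change can reach disk ahead of the commit record even though the calls were issued in the right order.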
many thanks and best regards,
Hans
--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at