Re: uninitialized values in revised prepared xact code

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: uninitialized values in revised prepared xact code
Date: 2014-10-11 19:20:42
Message-ID: 20141011192042.GE21267@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Uh, was this fixed. I see a cleanup commit for this C file, but this
report is from June:

commit 07a4a93a0e35a778c77ffbbbc18de29e859e18f0
Author: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
Date: Fri May 16 09:47:50 2014 +0300

Initialize tsId and dbId fields in WAL record of COMMIT PREPARED.

Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed
that prepared transaction commits reuse that struct. Fix that.

Because those fields were left unitialized, replaying a commit prepared WAL
record in a hot standby node would fail to remove the relcache init file.
That can lead to "could not open file" errors on the standby. Relcache init
file only needs to be removed when a system table/index is rewritten in the
transaction using two phase commit, so that should be rare in practice. In
HEAD, the incorrect dbId/tsId values are also used for filtering in logical
replication code, causing the transaction to always be filtered out.

Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was
introduced.

---------------------------------------------------------------------------

On Mon, Jun 30, 2014 at 11:58:59AM +0200, Andres Freund wrote:
> Hi,
>
> I've just rerun valgrind for the first time in a while and saw the
> following splat. My guess is it exists since bb38fb0d43c, but that's
> blindly guessing:
>
> ==2049== Use of uninitialised value of size 8
> ==2049== at 0x4FE66D: EndPrepare (twophase.c:1063)
> ==2049== by 0x4F231B: PrepareTransaction (xact.c:2217)
> ==2049== by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
> ==2049== by 0x79013E: finish_xact_command (postgres.c:2408)
> ==2049== by 0x78DE97: exec_simple_query (postgres.c:1062)
> ==2049== by 0x791FDD: PostgresMain (postgres.c:4010)
> ==2049== by 0x71B13B: BackendRun (postmaster.c:4113)
> ==2049== by 0x71A86D: BackendStartup (postmaster.c:3787)
> ==2049== by 0x71714C: ServerLoop (postmaster.c:1566)
> ==2049== by 0x716804: PostmasterMain (postmaster.c:1219)
> ==2049== by 0x679405: main (main.c:219)
> ==2049== Uninitialised value was created by a stack allocation
> ==2049== at 0x4FE16C: StartPrepare (twophase.c:942)
> ==2049==
> ==2049== Syscall param write(buf) points to uninitialised byte(s)
> ==2049== at 0x5C69640: __write_nocancel (syscall-template.S:81)
> ==2049== by 0x4FE6AE: EndPrepare (twophase.c:1064)
> ==2049== by 0x4F231B: PrepareTransaction (xact.c:2217)
> ==2049== by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
> ==2049== by 0x79013E: finish_xact_command (postgres.c:2408)
> ==2049== by 0x78DE97: exec_simple_query (postgres.c:1062)
> ==2049== by 0x791FDD: PostgresMain (postgres.c:4010)
> ==2049== by 0x71B13B: BackendRun (postmaster.c:4113)
> ==2049== by 0x71A86D: BackendStartup (postmaster.c:3787)
> ==2049== by 0x71714C: ServerLoop (postmaster.c:1566)
> ==2049== by 0x716804: PostmasterMain (postmaster.c:1219)
> ==2049== by 0x679405: main (main.c:219)
> ==2049== Address 0x64694ed is 1,389 bytes inside a block of size 8,192 alloc'd
> ==2049== at 0x4C27B8F: malloc (vg_replace_malloc.c:298)
> ==2049== by 0x8E766E: AllocSetAlloc (aset.c:853)
> ==2049== by 0x8E8E04: MemoryContextAllocZero (mcxt.c:627)
> ==2049== by 0x8A54D3: AtStart_Inval (inval.c:704)
> ==2049== by 0x4F1DFC: StartTransaction (xact.c:1841)
> ==2049== by 0x4F28D1: StartTransactionCommand (xact.c:2529)
> ==2049== by 0x7900A7: start_xact_command (postgres.c:2383)
> ==2049== by 0x78DAF4: exec_simple_query (postgres.c:860)
> ==2049== by 0x791FDD: PostgresMain (postgres.c:4010)
> ==2049== by 0x71B13B: BackendRun (postmaster.c:4113)
> ==2049== by 0x71A86D: BackendStartup (postmaster.c:3787)
> ==2049== by 0x71714C: ServerLoop (postmaster.c:1566)
> ==2049== Uninitialised value was created by a stack allocation
> ==2049== at 0x4FE16C: StartPrepare (twophase.c:942)
>
> It's probably just padding - twophase.c:1063 is the CRC32 computation of
> the record data.
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2014-10-11 20:38:22 Re: No toast table for pg_shseclabel but for pg_seclabel
Previous Message Bruce Momjian 2014-10-11 18:24:19 Re: Shared memory changes in 9.4?