Re: The segmentation fault of Postgresql 9.6.24

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Kevin Wang <kevinpgcloud(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: The segmentation fault of Postgresql 9.6.24
Date: 2023-12-28 22:20:18
Message-ID: 9c8b5912-9290-420c-57a4-c6185397c74a@enterprisedb.com
Lists: pgsql-hackers


On 12/28/23 21:09, Kevin Wang wrote:
> Hello hackers,
>
> Our prod databases are still PG 9.6.24.  We have one primary plus 3
> stream replications that are all working well for a long time.

Everything is working well until the day it breaks ...

> However, when I promoted one standby database to the primary role,
> we got the below error message from the PG log:
> =======================
> 2023-12-01 06:57:35.541 UTC,,,1553,,6569738f.611,639,,2023-12-01
> 05:47:59 UTC,,0,LOG,00000,"server process (PID 31839) was terminated by
> signal 11: Segmentation fault","Failed process was running: UPDATE xxxx
> SET employee_id = (9489910) WHERE id = (1162120221)",,,,,,,,""
>
>
>
> Here is the message from dmesg:
> =======================
> [ 3676.406247] postgres[27789]: segfault at 0 ip 00005618bf79bfe4 sp
> 00007ffcd9a75dc8 error 4 in postgres[5618bf3db000+3f7000]
> [ 3676.406265] Code: ff ff 48 83 c2 40 ff d0 e8 19 9c ff ff e8 44 0f c4
> ff 0f 1f 40 00 f3 0f 1e fa e9 27 be cc ff 0f 1f 80 00 00 00 00 f3 0f 1e
> fa <0f> b6 17 89 d1
>  83 e1 03 80 f9 02 74 0f 80 fa 01 74 0a 48 89 f8 c3
> [ 3715.937850] postgres[27928]: segfault at 0 ip 00005618bf79bfe4 sp
> 00007ffcd9a75dc8 error 4 in postgres[5618bf3db000+3f7000]
> [ 3715.937858] Code: ff ff 48 83 c2 40 ff d0 e8 19 9c ff ff e8 44 0f c4
> ff 0f 1f 40 00 f3 0f 1e fa e9 27 be cc ff 0f 1f 80 00 00 00 00 f3 0f 1e
> fa <0f> b6 17 89 d1
>  83 e1 03 80 f9 02 74 0f 80 fa 01 74 0a 48 89 f8 c3
> [ 3732.278367] postgres[28212]: segfault at 0 ip 00005618bf79bfe4 sp
> 00007ffcd9a75dc8 error 4 in postgres[5618bf3db000+3f7000]
> [ 3732.278384] Code: ff ff 48 83 c2 40 ff d0 e8 19 9c ff ff e8 44 0f c4
> ff 0f 1f 40 00 f3 0f 1e fa e9 27 be cc ff 0f 1f 80 00 00 00 00 f3 0f 1e
> fa <0f> b6 17 89 d1
>  83 e1 03 80 f9 02 74 0f 80 fa 01 74 0a 48 89 f8 c3
>
> Error 4 is the error related to unmapping memory. But the database worked
> well for a long time as the standby database. After it was promoted to the
> primary role, no memory parameter was changed at all.
>

Why do you think "4" means unmapping memory? 4 is the error code for
"user-mode access" (i.e. the invalid memory access happened in user
space, not in the kernel).
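
For reference, a minimal sketch of how the low three bits of the x86
page-fault error code (the "error N" in the dmesg line) are usually read.
The helper name decode_pf_error is made up for illustration, and only
bits 0-2 are handled here.

```python
# Low three bits of the x86 page-fault error code, as printed by the
# kernel in "segfault at ... error N". Each entry is
# (mask, meaning when the bit is set, meaning when the bit is clear).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
]


def decode_pf_error(code):
    """Return a human-readable description for bits 0-2 of the error code."""
    return [
        set_name if code & mask else clear_name
        for mask, set_name, clear_name in PF_BITS
    ]


print(decode_pf_error(4))
# ['page not present', 'read access', 'user mode']
```

Read this way, "error 4" together with "segfault at 0" is consistent with
a user-space read through a NULL pointer, but that says nothing yet about
which code did it.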

> Could you give us some hint where to fix this issue?
>

This could be pretty much anything, and without seeing where exactly it
fails it's impossible to say. I see you apparently hit the issue
repeatedly, and all the information is *exactly* the same - addresses,
code, etc. Try decoding the addresses with addr2line, or even better get
a proper backtrace - either from a core file, or using gdb.
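
A rough sketch of the addr2line step: parse one of the dmesg lines and
subtract the mapping base from the instruction pointer to get an offset
into the binary. The function name ip_offset and the binary path in the
printed command are placeholders. It also assumes the mapping shown in
brackets starts at the load base of the executable, which is not
guaranteed, so the resulting offset may need adjusting.

```python
import re

# Matches the kernel's "segfault at ADDR ip IP sp SP error N in NAME[BASE+SIZE]"
# message. All numbers except the error code are hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<name>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def ip_offset(line):
    """Return (mapping name, ip - mapping base) for a dmesg segfault line."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("ip is outside the reported mapping")
    return m.group("name"), offset


# First dmesg line from the report, with the mail line wrap undone.
line = (
    "[ 3676.406247] postgres[27789]: segfault at 0 ip 00005618bf79bfe4 "
    "sp 00007ffcd9a75dc8 error 4 in postgres[5618bf3db000+3f7000]"
)
name, offset = ip_offset(line)
# /path/to/postgres is a placeholder for the actual server binary.
print(f"addr2line -f -e /path/to/{name} {offset:#x}")
# addr2line -f -e /path/to/postgres 0x3c0fe4
```

Useful output needs the same build of the binary with debug symbols
installed. A backtrace from a core file or gdb is still the better option.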

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
