From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Subject: | Re: failure in 019_replslot_limit |
Date: | 2024-02-09 18:59:15 |
Message-ID: | 20240209185915.btlqlp6of3zc6qxi@awork3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2024-02-09 18:00:01 +0300, Alexander Lakhin wrote:
> I've managed to reproduce this issue (which still persists:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-02-04%2001%3A53%3A44
> ) and saw that it's not checkpointer, but walsender is hanging:
How did you reproduce this?
> And I see the walsender process still running (I've increased the timeout
> to keep the test running and to connect to the process in question), with
> the following stack trace:
> #0 0x00007fe4feac3d16 in epoll_wait (epfd=5, events=0x55b279b70f38,
> maxevents=1, timeout=timeout@entry=-1) at
> ../sysdeps/unix/sysv/linux/epoll_wait.c:30
> #1 0x000055b278b9ab32 in WaitEventSetWaitBlock
> (set=set@entry=0x55b279b70eb8, cur_timeout=cur_timeout@entry=-1,
> occurred_events=occurred_events@entry=0x7ffda5ffac90,
> nevents=nevents@entry=1) at latch.c:1571
> #2 0x000055b278b9b6b6 in WaitEventSetWait (set=0x55b279b70eb8,
> timeout=timeout@entry=-1,
> occurred_events=occurred_events@entry=0x7ffda5ffac90,
> nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=100663297) at
> latch.c:1517
> #3 0x000055b278a3f11f in secure_write (port=0x55b279b65aa0,
> ptr=ptr@entry=0x55b279bfbd08, len=len@entry=21470) at be-secure.c:296
> #4 0x000055b278a460dc in internal_flush () at pqcomm.c:1356
> #5 0x000055b278a461d4 in internal_putbytes (s=s@entry=0x7ffda5ffad3c "E\177", len=len@entry=1) at pqcomm.c:1302
So the issue is that we wait effectively forever to send a FATAL. I've
previously proposed that we should not block when sending out fatal errors,
given that blocking allows clients to prevent graceful restarts and a lot of
other things.
Greetings,
Andres Freund