RE: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO Failure Occurs

From: Chengchao Yu <chengyu(at)microsoft(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Prabhat Tripathi <ptrip(at)microsoft(dot)com>, Sunil Kamath <Sunil(dot)Kamath(at)microsoft(dot)com>, Michal Primke <mprimke(at)microsoft(dot)com>, TEJA Mupparti <Tejeswar(dot)Mupparti(at)microsoft(dot)com>
Subject: RE: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO Failure Occurs
Date: 2019-07-27 00:52:20
Message-ID: MW2PR2101MB0908059AACF61F4F4D0B2B1AAAC30@MW2PR2101MB0908.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Kyotaro,

Thank you so much for your valued feedback and suggestions!

> I assume that we are in a consensus about the problem we are to fix here.
>
> > 0a 00000004`8080cc30 00000004`80dcf917 postgres!PGSemaphoreLock+0x65
> > [d:\orcasqlagsea10\14\s\src\backend\port\win32_sema.c @ 158] 0b
> > 00000004`8080cc90 00000004`80db025c postgres!LWLockAcquire+0x137
> > [d:\orcasqlagsea10\14\s\src\backend\storage\lmgr\lwlock.c @ 1234] 0c
> > 00000004`8080ccd0 00000004`80db25db postgres!AbortBufferIO+0x2c
> > [d:\orcasqlagsea10\14\s\src\backend\storage\buffer\bufmgr.c @ 3995] 0d
> > 00000004`8080cd20 00000004`80dbce36 postgres!AtProcExit_Buffers+0xb
> > [d:\orcasqlagsea10\14\s\src\backend\storage\buffer\bufmgr.c @ 2479] 0e
> > 00000004`8080cd50 00000004`80dbd1bd postgres!shmem_exit+0xf6
> > [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 262] 0f
> > 00000004`8080cd80 00000004`80dbccfd postgres!proc_exit_prepare+0x4d
> > [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 188]
> > 10 00000004`8080cdb0 00000004`80ef9e74 postgres!proc_exit+0xd
> > [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 141]
> > 11 00000004`8080cde0 00000004`80ddb6ef postgres!errfinish+0x204
> > [d:\orcasqlagsea10\14\s\src\backend\utils\error\elog.c @ 624]
> > 12 00000004`8080ce50 00000004`80db0f59 postgres!mdread+0x12f
> > [d:\orcasqlagsea10\14\s\src\backend\storage\smgr\md.c @ 806]

Yes, this is one of the two situations we want to fix. The other situation is a cascade exception case like following.

#0 0x00007f0fdb7cb6d6 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0, futex_word=0x7f0fd14c81b8) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 do_futex_wait (sem=sem(at)entry=0x7f0fd14c81b8, abstime=0x0) at sem_waitcommon.c:111
#2 0x00007f0fdb7cb7c8 in __new_sem_wait_slow (sem=0x7f0fd14c81b8, abstime=0x0) at sem_waitcommon.c:181
#3 0x00005630d475658a in PGSemaphoreLock (sema=0x7f0fd14c81b8) at pg_sema.c:316
#4 0x00005630d47f689e in LWLockAcquire (lock=0x7f0fd9ae9c00, mode=LW_EXCLUSIVE) at /path/to/postgres/source/build/../src/backend/storage/lmgr/lwlock.c:1243
#5 0x00005630d47cd087 in AbortBufferIO () at /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:3988
#6 0x00005630d47cb3f9 in AtProcExit_Buffers (code=1, arg=0) at /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2473
#7 0x00005630d47dbc32 in shmem_exit (code=1) at /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:272
#8 0x00005630d47dba5e in proc_exit_prepare (code=1) at /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:194
#9 0x00005630d47db9c6 in proc_exit (code=1) at /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:107
#10 0x00005630d49811bc in errfinish (dummy=0) at /path/to/postgres/source/build/../src/backend/utils/error/elog.c:541
#11 0x00005630d4801f1f in mdwrite (reln=0x5630d6588c68, forknum=MAIN_FORKNUM, blocknum=8, buffer=0x7f0fd1ae9c00 "", skipFsync=false) at /path/to/postgres/source/build/../src/backend/storage/smgr/md.c:843
#12 0x00005630d4804716 in smgrwrite (reln=0x5630d6588c68, forknum=MAIN_FORKNUM, blocknum=8, buffer=0x7f0fd1ae9c00 "", skipFsync=false) at /path/to/postgres/source/build/../src/backend/storage/smgr/smgr.c:650
#13 0x00005630d47cb824 in FlushBuffer (buf=0x7f0fd19e9c00, reln=0x5630d6588c68) at /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2751
#14 0x00005630d47cb219 in SyncOneBuffer (buf_id=0, skip_recently_used=false, wb_context=0x7ffccc371a70) at /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2394
#15 0x00005630d47cab00 in BufferSync (flags=6) at /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:1984
#16 0x00005630d47cb57f in CheckPointBuffers (flags=6) at /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2578
#17 0x00005630d44a685b in CheckPointGuts (checkPointRedo=23612304, flags=6) at /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:9149
#18 0x00005630d44a62cf in CreateCheckPoint (flags=6) at /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:8937
#19 0x00005630d44a45e3 in StartupXLOG () at /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:7723
#20 0x00005630d4995f88 in InitPostgres (in_dbname=0x5630d65582b0 "postgres", dboid=0, username=0x5630d653d7d0 "chengyu", useroid=0, out_dbname=0x0, override_allow_connections=false)
at /path/to/postgres/source/build/../src/backend/utils/init/postinit.c:636
#21 0x00005630d480b68b in PostgresMain (argc=7, argv=0x5630d6534d20, dbname=0x5630d65582b0 "postgres", username=0x5630d653d7d0 "chengyu") at /path/to/postgres/source/build/../src/backend/tcop/postgres.c:3810
#22 0x00005630d4695e8b in main (argc=7, argv=0x5630d6534d20) at /path/to/postgres/source/build/../src/backend/main/main.c:224
Though ENOSPC is avoided by reservation in PG, the other error code could be returned from OS to form this stack.

> Ok, we are fixing this. The proposed patch lets LWLockReleaseAll() called before
> InitBufferPoolBackend() by registering the former after the latter into on_shmem_exit
> list. Even if it works, I think it's neither clean nor safe to register multiple
> order-sensitive callbacks.

Actually I think the order of callbacks retains the order of how components got initialized. In the patch v2, the specific location requirement was for the cascade exception to work as well.
However, I think we can discuss about this as we just would like to ensure lw-locks are released before AbortBufferIO().

> And the only caller of it is shmem_exit. More of that, all other caller sites calls
> LWLockReleaseAll() just before calling it. If that's the case, why don't we just release
> all LWLocks in shmem_exit or in AtProcExit_Buffers before calling AbortBufferIO()? I think
> it's sufficient that AtProcExit_Buffers calls it at the beginning. (The comment for the
> funcgtion needs editing).

Putting LWLockReleaseAll() in AtProcExit_Buffers() is OK, however, it does not work for the cascade exception case if putting in shmem_exit().
Indeed putting LWLockReleaseAll() in AtProcExit_Buffers() was considered firstly, but as the other part of PG code base prefers putting in other callbacks (e.g. ShutdownAuxiliaryProcess() callback when UnderPostmaster is true), I just followed the same style in patch v2.
But after revisited the decision, I think I agree with you, because:
1. Yes, it looks cleaner in the code.
2. We can avoid the pain if people forgot or wrongly registered the additional callback.
3. Calling LWLockReleaseAll() for the second time is quite fast, it will not bring overburden to AtProcExit_Buffers()

Thus, I have updated the patch v3 according to your suggestions. Could you help to review again?
Please let me know should you have more suggestions or feedbacks.

Thank you!

Best regards,
--
Chengchao Yu
Software Engineer | Microsoft | Azure Database for PostgreSQL
https://azure.microsoft.com/en-us/services/postgresql/

Attachment Content-Type Size
fix-deadlock-v3.patch application/octet-stream 895 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-07-27 01:15:08 Re: Duplicated LSN in ReorderBuffer
Previous Message Tom Lane 2019-07-27 00:24:16 Re: TopoSort() fix