BUG #17519: I get a segmentation fault when querying in parallel.

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: zhuangxiaodong0709(at)gmail(dot)com
Subject: BUG #17519: I get a segmentation fault when querying in parallel.
Date: 2022-06-16 06:17:06
Message-ID: 17519-59c6d6f4afd7fa0f@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17519
Logged by: zhuang xiaodong
Email address: zhuangxiaodong0709(at)gmail(dot)com
PostgreSQL version: 13.1
Operating system: CentOS 7.2
Description:

Hello Postgres team,

I get a segmentation fault when querying in parallel.
When I set max_parallel_workers_per_gather = 0, the issue goes away, so I
suspect it is caused by parallel query.
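
For reference, a minimal sketch of how this workaround can be applied. The report does not say at which level the setting was changed, so both the per-session and the server-wide form are shown as assumptions:

```sql
-- Check the current value of the setting.
SHOW max_parallel_workers_per_gather;

-- Assumption: disable parallel query for the current session only.
SET max_parallel_workers_per_gather = 0;

-- Assumption: alternatively, disable it server-wide (requires superuser)
-- and reload the configuration so new sessions pick it up.
ALTER SYSTEM SET max_parallel_workers_per_gather = 0;
SELECT pg_reload_conf();
```

With the value at 0 the planner does not start parallel workers under Gather nodes, so this only avoids the crash and does not fix its cause.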

System Configuration
---------------------
Architecture : Intel Pentium
Operating System : CentOS 7.2
PostgreSQL version : PostgreSQL-(12.6, 12.7, 13.2, 13.3)

Please enter a FULL description of your problem:
------------------------------------------------

error log:
2021-11-24 07:33:11 UTC 00000 LOG: 00000: background worker "parallel worker" (PID 50844) was terminated by signal 11: Segmentation fault
2021-11-24 07:33:11 UTC 00000 DETAIL: Failed process was running: SELECT
"tasks"."id","tasks"."created_by","tasks"."created_at","tasks"."updated_by","tasks"."updated_at","tasks"."assignee_ids","tasks"."board_id","tasks"."category_id","tasks"."code_name","tasks"."contact_info","tasks"."description","tasks"."end_datetime","tasks"."is_archived",
"tasks"."location","tasks"."parent_task_id","tasks"."position","tasks"."section","tasks"."stage_id","tasks"."start_datetime","tasks"."status","tasks"."tag_names","tasks"."title","tasks"."source",
"tasks"."article_issue_date","tasks"."article_suggest_to_print","tasks"."article_suggested_print_page","tasks"."article_summary","tasks"."remarks","tasks"."workflow_status"
FROM "tasks" LEFT JOIN tasks pt ON tasks.parent_task_id = pt.id
LEFT JOIN (select task_id from task_articles group by task_id) ta on ta.task_id = COALESCE(pt.id, tasks.id)
WHERE tasks.board_id in ($1) AND
((NOT(tasks.start_datetime > $2 OR tasks.end_datetime < $3)) OR (
ta is not null AND ((tasks.source != 'collaborate_with' AND
tasks.article_issue_date between $4 and $5) OR
2021-11-24 07:33:11 UTC 00000 LOCATION: LogChildExit, postmaster.c:3769
2021-11-24 07:33:11 UTC 00000 LOG: 00000: terminating any other active server processes
2021-11-24 07:33:11 UTC 00000 LOCATION: HandleChildCrash, postmaster.c:3487

When I open the core dump in gdb, the backtrace looks like this:
(gdb) bt
#0 FreeTupleDesc (tupdesc=0x7f4e2904a101) at tupdesc.c:325
#1 0x0000000000914845 in ShutdownTupleDescRef (arg=33152600) at execExprInterp.c:2025
#2 0x00000000008fb23a in ShutdownExprContext (isCommit=<optimized out>, econtext=0x1effc38) at execUtils.c:1006
#3 FreeExprContext (econtext=0x1effc38, isCommit=<optimized out>) at execUtils.c:429
#4 0x00000000008fb2a9 in FreeExecutorState (estate=0x1eff798) at ../../../src/include/nodes/pg_list.h:127
#5 0x000000000090dba1 in standard_ExecutorEnd () at execMain.c:513
#6 0x000000000096976c in ExecutorEnd (queryDesc=0x1f44848) at execMain.c:467
#7 PortalCleanup (portal=<optimized out>) at portalcmds.c:305
#8 0x0000000000561311 in PortalDrop () at portalmem.c:501
#9 0x00000000005618ff in PreCommit_Portals () at portalmem.c:749
#10 0x0000000000a80fc7 in CommitTransaction.lto_priv.0 () at xact.c:2096
#11 0x0000000000a81f75 in CommitTransactionCommand () at xact.c:3098
#12 0x000000000071dcda in finish_xact_command () at postgres.c:2825
#13 0x0000000000720e12 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:4703
#14 0x00000000007c761a in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4657
#15 BackendStartup (port=0x1a7cb00) at postmaster.c:4341
#16 ServerLoop () at postmaster.c:1755
#17 0x00000000007c8501 in PostmasterMain () at postmaster.c:1428
#18 0x00000000004f205f in main (argc=3, argv=0x1a57950) at main.c:210
(gdb) p tupdesc
$1 = (struct TupleDescData *) 0x7f4e2904a101
(gdb) p tupdesc->constr
Cannot access memory at address 0x7f4e2904a111
(gdb) p tupdesc->constr->num_defval
Cannot access memory at address 0x7f4e2904a111

This looks like the same manifestation as the following issue, which was
fixed in 13.4, but with a different stack:
Fix race condition in code for sharing tuple descriptors across parallel
workers (Thomas Munro)

I would appreciate your help. Thanks.
