clang's sanitizer makes stringToNode() extremely slow

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: clang's sanitizer makes stringToNode() extremely slow
Date: 2024-04-18 15:00:00
Message-ID: eb0a5bbf-1dc8-ea26-e9c6-0bf8f190dd54@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello hackers,

When using a server built with clang (15, 18) with sanitizers enabled,
the last query in the following script:
SET parallel_setup_cost = 0;
SET min_parallel_table_scan_size = 0;

SELECT a::text INTO t FROM generate_series(1, 1000) a;
\timing on
SELECT string_agg(a, ',') FROM t WHERE a = REPEAT('0', 400000);

runs for more than a minute on my workstation (a larger repeat count gives
much longer duration):
Time: 66017.594 ms (01:06.018)

The typical stack trace for a running parallel worker is:
#0  0x0000559a23671885 in __sanitizer::internal_strlen(char const*) ()
#1  0x0000559a236568ed in StrtolFixAndCheck(void*, char const*, char**, char*, int) ()
#2  0x0000559a236423dc in __interceptor_strtol ()
#3  0x0000559a240027d9 in atoi (...) at readfuncs.c:629
#5  0x0000559a23fb03f0 in _readConst () at readfuncs.c:275
#6  parseNodeString () at ./readfuncs.switch.c:29
#7  0x0000559a23faa421 in nodeRead (
    token=0x7fee75cf3bd2 "{CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false
:constisnull false :location -1 :constvalue 400004 [ 16 106 24 0 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48"...,
    tok_len=1) at read.c:338
#8  0x0000559a23faa916 in nodeRead (...) at read.c:452
#9  0x0000559a23fb3f34 in _readOpExpr () at ./readfuncs.funcs.c:279
#10 0x0000559a23fb0842 in parseNodeString () at ./readfuncs.switch.c:47
#11 0x0000559a23faa421 in nodeRead (...) at read.c:338
#12 0x0000559a23faa916 in nodeRead (...) at read.c:452
#13 0x0000559a23fefb74 in _readSeqScan () at ./readfuncs.funcs.c:3954
#14 0x0000559a23fb2b97 in parseNodeString () at ./readfuncs.switch.c:559
#15 0x0000559a23faa421 in nodeRead (...) at read.c:338
#16 0x0000559a23ffc033 in _readAgg () at ./readfuncs.funcs.c:4685
#17 0x0000559a23fb2dd3 in parseNodeString () at ./readfuncs.switch.c:611
#18 0x0000559a23faa421 in nodeRead (...) at read.c:338
#19 0x0000559a23feb340 in _readPlannedStmt () at ./readfuncs.funcs.c:3685
#20 0x0000559a23fb2ad1 in parseNodeString () at ./readfuncs.switch.c:541
#21 0x0000559a23faa421 in nodeRead (...) at read.c:338
#22 0x0000559a23fa99d8 in stringToNodeInternal (...) at read.c:92
#24 0x0000559a23d66609 in ExecParallelGetQueryDesc (...) at execParallel.c:1250
#25 ParallelQueryMain (...) at execParallel.c:1424
#26 0x0000559a238cfe13 in ParallelWorkerMain (...) at parallel.c:1516
#27 0x0000559a241e5b6a in BackgroundWorkerMain (...) at bgworker.c:848
#28 0x0000559a241ec254 in postmaster_child_launch (...) at launch_backend.c:265
#29 0x0000559a241f1c15 in do_start_bgworker (...) at postmaster.c:4270
#30 maybe_start_bgworkers () at postmaster.c:4501
#31 0x0000559a241f486e in process_pm_pmsignal () at postmaster.c:3774
#32 ServerLoop () at postmaster.c:1667
#33 0x0000559a241f0ed6 in PostmasterMain (...) at postmaster.c:1372
#34 0x0000559a23ebe16d in main (...) at main.c:197

The flamegraph (which shows that readDatum() -> __interceptor_strtol() ->
StrtolFixAndCheck() -> __sanitizer::internal_strlen () takes >99% of time)
is attached.
(I could not reproduce this behaviour with the gcc's sanitizer.)

Moreover, this query cannot be canceled, you can only kill -SIGKILL
parallel workers to interrupt it.

Best regards,
Alexander

Attachment Content-Type Size
perf-flamegraph.svg image/svg+xml 39.6 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2024-04-18 15:07:34 Re: pg17 issues with not-null contraints
Previous Message Nathan Bossart 2024-04-18 14:57:18 Re: improve performance of pg_dump --binary-upgrade