From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | exclusion(at)gmail(dot)com |
Subject: | BUG #17950: Incorrect memory access in gtsvector_picksplit() |
Date: | 2023-05-29 20:00:01 |
Message-ID: | 17950-6c80a8d2b94ec695@postgresql.org |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 17950
Logged by: Alexander Lakhin
Email address: exclusion(at)gmail(dot)com
PostgreSQL version: 16beta1
Operating system: Ubuntu 22.04
Description:
The following script:
CREATE TABLE test_tsvector(t text, a tsvector);
SELECT 'COPY test_tsvector FROM ''.../src/test/regress/data/tsearch.data'';'
FROM generate_series(1, 19)
\gexec
CREATE TRIGGER tsvectorupdate
BEFORE UPDATE OR INSERT ON test_tsvector
FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(a, 'pg_catalog.english', t);
SELECT 'COPY test_tsvector FROM ''.../src/test/regress/data/tsearch.data'';'
FROM generate_series(1, 38)
\gexec
CREATE INDEX gistidx ON test_tsvector USING gist (a tsvector_ops(siglen=1));
-- I believe it's not the only way to get a data pattern needed, so
-- probably the repro could be simplified, if desired.
triggers a Valgrind-detected memory access error:
==00:00:00:53.414 342514== Invalid read of size 1
==00:00:00:53.414 342514== at 0x79787D: pg_popcount (pg_bitutils.c:332)
==00:00:00:53.414 342514== by 0x6F93C6: sizebitvec (tsgistidx.c:488)
==00:00:00:53.414 342514== by 0x6FA24A: gtsvector_picksplit (tsgistidx.c:731)
==00:00:00:53.414 342514== by 0x74146D: FunctionCall2Coll (fmgr.c:1132)
==00:00:00:53.414 342514== by 0x217E65: gistUserPicksplit (gistsplit.c:433)
==00:00:00:53.414 342514== by 0x2184A5: gistSplitByKey (gistsplit.c:697)
==00:00:00:53.414 342514== by 0x20C237: gistSplit (gist.c:1450)
==00:00:00:53.414 342514== by 0x20CA32: gistplacetopage (gist.c:309)
==00:00:00:53.414 342514== by 0x20DBFB: gistinserttuples (gist.c:1278)
==00:00:00:53.414 342514== by 0x20DF85: gistfinishsplit (gist.c:1376)
==00:00:00:53.414 342514== by 0x20DC79: gistinserttuples (gist.c:1305)
==00:00:00:53.414 342514== by 0x20E39E: gistinserttuple (gist.c:1231)
==00:00:00:53.414 342514== Address 0x7386c68 is 16,024 bytes inside a block of size 16,384 alloc'd
==00:00:00:53.414 342514== at 0x4848899: malloc (vg_replace_malloc.c:381)
==00:00:00:53.414 342514== by 0x7602C7: AllocSetAlloc (aset.c:924)
==00:00:00:53.414 342514== by 0x76C26A: palloc (mcxt.c:1240)
==00:00:00:53.414 342514== by 0x20C24A: gistSplit (gist.c:1453)
==00:00:00:53.414 342514== by 0x20CA32: gistplacetopage (gist.c:309)
==00:00:00:53.414 342514== by 0x20DBFB: gistinserttuples (gist.c:1278)
==00:00:00:53.414 342514== by 0x20E39E: gistinserttuple (gist.c:1231)
==00:00:00:53.414 342514== by 0x20E940: gistdoinsert (gist.c:886)
==00:00:00:53.414 342514== by 0x21152B: gistBuildCallback (gistbuild.c:929)
==00:00:00:53.414 342514== by 0x24420D: heapam_index_build_range_scan (heapam_handler.c:1708)
==00:00:00:53.414 342514== by 0x2119E4: table_index_build_scan (tableam.h:1781)
==00:00:00:53.414 342514== by 0x2119E4: gistbuild (gistbuild.c:317)
==00:00:00:53.414 342514== by 0x2E06F5: index_build (index.c:3032)
==00:00:00:53.414 342514==
(Several runs might be required to reproduce the issue.)
With the additional debug logging in gtsvector_picksplit():
@@ -722,6 +722,11 @@ gtsvector_picksplit(PG_FUNCTION_ARGS)
continue;
}
+if (!cache[j].allistrue) {
+elog(LOG, "!!!gtsvector_picksplit| j: %d, cache[j].sign: %p, GETSIGN(cache[j].sign): %p", j, cache[j].sign, GETSIGN(cache[j].sign));
+VALGRIND_CHECK_MEM_IS_DEFINED(GETSIGN(cache[j].sign), siglen);
+}
+
if (ISALLTRUE(datum_l) || cache[j].allistrue)
{
if (ISALLTRUE(datum_l) && cache[j].allistrue)
(and #include "utils/memdebug.h")
I see:
cache[j].sign: 0x723a999, GETSIGN(cache[j].sign): 0x723a9a1
==00:00:00:18.356 351519== Unaddressable byte(s) found during client check request
==00:00:00:18.356 351519== at 0x6FA2BE: gtsvector_picksplit (tsgistidx.c:727)
...
==00:00:00:18.356 351519== Address 0x723a9a1 is 4,977 bytes inside a block of size 8,192 alloc'd
Reproduced starting from 911e70207.
But with a slight variation:
CREATE INDEX gistidx ON test_tsvector USING gist (a tsvector_ops);
and the debugging patch:
@@ -711,6 +712,9 @@ gtsvector_picksplit(PG_FUNCTION_ARGS)
else
size_alpha = hemdistsign(cache[j].sign, GETSIGN(datum_l));
+if (!cache[j].allistrue)
+VALGRIND_CHECK_MEM_IS_DEFINED(GETSIGN(cache[j].sign), SIGLEN);
+
if (ISALLTRUE(datum_r) || cache[j].allistrue)
{
if (ISALLTRUE(datum_r) && cache[j].allistrue)
it's reproduced even on 911e70207~1:
==00:00:00:15.858 370963== Unaddressable byte(s) found during client check request
==00:00:00:15.858 370963== at 0x636E0E: gtsvector_picksplit (tsgistidx.c:716)
==00:00:00:15.858 370963== by 0x67B6D3: FunctionCall2Coll (fmgr.c:1162)
==00:00:00:15.858 370963== by 0x1FB662: gistUserPicksplit (gistsplit.c:433)
==00:00:00:15.858 370963== by 0x1FBCCF: gistSplitByKey (gistsplit.c:697)
==00:00:00:15.858 370963== by 0x1F088A: gistSplit (gist.c:1441)
==00:00:00:15.858 370963== by 0x1F0F3C: gistplacetopage (gist.c:302)
==00:00:00:15.858 370963== by 0x1F202F: gistinserttuples (gist.c:1270)
==00:00:00:15.858 370963== by 0x1F273A: gistinserttuple (gist.c:1223)
==00:00:00:15.858 370963== by 0x1F2BCE: gistdoinsert (gist.c:879)
==00:00:00:15.858 370963== by 0x1F4E8F: gistBuildCallback (gistbuild.c:470)
==00:00:00:15.858 370963== by 0x2262BD: heapam_index_build_range_scan (heapam_handler.c:1659)
==00:00:00:15.858 370963== by 0x1F50E7: table_index_build_scan (tableam.h:1540)
==00:00:00:15.858 370963== by 0x1F50E7: gistbuild (gistbuild.c:196)
==00:00:00:15.858 370963== Address 0x11a3a68b is 4,939 bytes inside a block of size 16,384 alloc'd
==00:00:00:15.858 370963== at 0x4848899: malloc (vg_replace_malloc.c:381)
==00:00:00:15.858 370963== by 0x699954: AllocSetAlloc (aset.c:941)
==00:00:00:15.858 370963== by 0x6A26CC: palloc (mcxt.c:963)
==00:00:00:15.858 370963== by 0x636990: gtsvector_picksplit (tsgistidx.c:613)
...
So it looks like this defect has existed in core since 140d4ebcb.
IIUC, using the GETSIGN macro with cache[j].sign is a mistake -- cache[j].sign
already refers to the signature itself, yet the macro adds 8 to its address,
so for the last j the read runs past the end of the allocated memory.
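To illustrate the arithmetic, here is a minimal standalone sketch (not PostgreSQL code). All names in it (DEMO_HDRSIZE, DEMO_GETSIGN, DemoCacheEntry and both functions) are hypothetical stand-ins, and it assumes that the macro skips an 8-byte header and that cached signatures are packed back to back in one buffer; the real layout in tsgistidx.c may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the macro skips an 8-byte header that precedes the signature
 * in a stored key. The value 8 is taken from the pointer difference seen in
 * the debug log above (0x723a9a1 - 0x723a999). */
#define DEMO_HDRSIZE 8
#define DEMO_GETSIGN(x) ((unsigned char *) ((char *) (x) + DEMO_HDRSIZE))

/* Hypothetical cache entry: sign already points at the signature bytes,
 * so there is no header left to skip. */
typedef struct
{
	unsigned char *sign;
	int			allistrue;
} DemoCacheEntry;

/* Number of bytes by which DEMO_GETSIGN(entry->sign) lands past the real
 * start of the signature. The caller must pass a buffer of at least
 * DEMO_HDRSIZE bytes so the pointer arithmetic stays well defined. */
ptrdiff_t
demo_getsign_shift(const DemoCacheEntry *entry)
{
	return DEMO_GETSIGN(entry->sign) - entry->sign;
}

/* Assumption: nentries signatures of siglen bytes each are packed back to
 * back. Returns how many bytes past the end of that buffer a siglen-byte
 * read from the shifted pointer of the last entry would reach. Uses plain
 * integers only; no memory is touched. */
ptrdiff_t
demo_overrun_for_last_entry(size_t nentries, size_t siglen)
{
	size_t		buf_len;
	size_t		last_start;
	size_t		shifted_end;

	assert(nentries > 0);

	buf_len = nentries * siglen;
	last_start = (nentries - 1) * siglen;
	shifted_end = last_start + DEMO_HDRSIZE + siglen;

	return (ptrdiff_t) shifted_end - (ptrdiff_t) buf_len;
}
```

Under these assumptions, the shifted read for the last entry always ends DEMO_HDRSIZE bytes past the packed buffer, whatever the signature length. Whether a given run actually faults depends on what happens to follow the buffer in memory, which would be consistent with several runs being needed to reproduce the issue.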