Bug: Reading from single byte character column type may cause out of bounds memory reads.

From: Spyridon Dimitrios Agathos <spyridon(dot)dimitrios(dot)agathos(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Bug: Reading from single byte character column type may cause out of bounds memory reads.
Date: 2022-07-13 12:09:12
Message-ID: CAFM5RapGbBQm+dH=7K80HcvBvEWiV5Tm7N=NRaYURfm98YWc8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

While I was writing a test for PSQL, I faced a weird scenario. Depending on
how I build PSQL (enabling or not debug options), I saw different results
for the following query.

Steps to reproduce:
- OS: Ubuntu 20.04
- PSQL version 14.4

CREATE TABLE test (single_byte_col "char");
INSERT INTO test (single_byte_col) VALUES ('๐Ÿ€†');

In case where the following query runs in debug mode, I got this result:
SELECT left(single_byte_col, 1) l FROM test;
l
-----------
~\x7F\x7F
(1 row)

Once I turn off debug mode, I got:
SELECT left(single_byte_col, 1) l FROM test;
l
---

(1 row)

That triggered me to use Valgrind, which reported the following error:

==00:00:03:22.867 1171== VALGRINDERROR-BEGIN
==00:00:03:22.867 1171== Invalid read of size 1
==00:00:03:22.867 1171== at 0x4C41209: memmove (vg_replace_strmem.c:1382)
==00:00:03:22.868 1171== by 0xC8D8D0: text_substring
(../src/backend/utils/adt/varlena.c:1050)
==00:00:03:22.868 1171== by 0xC9AC78: text_left
(../src/backend/utils/adt/varlena.c:5614)
==00:00:03:22.868 1171== by 0x7FBB75: ExecInterpExpr
(../src/backend/executor/execExprInterp.c:749)
==00:00:03:22.868 1171== by 0x7FADA6: ExecInterpExprStillValid
(../src/backend/executor/execExprInterp.c:1824)
==00:00:03:22.868 1171== by 0x81C0AA: ExecEvalExprSwitchContext
(../src/include/executor/executor.h:339)
==00:00:03:22.868 1171== by 0x81BD96: ExecProject
(../src/include/executor/executor.h:373)
==00:00:03:22.868 1171== by 0x81B8BE: ExecScan
(../src/backend/executor/execScan.c:238)
==00:00:03:22.868 1171== by 0x8638BA: ExecSeqScan
(../src/backend/executor/nodeSeqscan.c:112)
==00:00:03:22.868 1171== by 0x8170D4: ExecProcNodeFirst
(../src/backend/executor/execProcnode.c:465)
==00:00:03:22.868 1171== by 0x80F291: ExecProcNode
(../src/include/executor/executor.h:257)
==00:00:03:22.868 1171== by 0x80A8F0: ExecutePlan
(../src/backend/executor/execMain.c:1551)
==00:00:03:22.868 1171== by 0x80A7A3: standard_ExecutorRun
(../src/backend/executor/execMain.c:361)
==00:00:03:22.868 1171== by 0x136231BC: pgss_ExecutorRun
(../contrib/pg_stat_statements/pg_stat_statements.c:1001)
==00:00:03:22.868 1171== by 0x80A506: ExecutorRun
(../src/backend/executor/execMain.c:303)
==00:00:03:22.868 1171== by 0xAD71B0: PortalRunSelect
(../src/backend/tcop/pquery.c:921)
==00:00:03:22.868 1171== by 0xAD6B75: PortalRun
(../src/backend/tcop/pquery.c:765)
==00:00:03:22.868 1171== by 0xAD1B3C: exec_simple_query
(../src/backend/tcop/postgres.c:1214)
==00:00:03:22.868 1171== by 0xAD0D7F: PostgresMain
(../src/backend/tcop/postgres.c:4496)
==00:00:03:22.868 1171== by 0x9D6C79: BackendRun
(../src/backend/postmaster/postmaster.c:4530)
==00:00:03:22.868 1171== by 0x9D61A3: BackendStartup
(../src/backend/postmaster/postmaster.c:4252)
==00:00:03:22.868 1171== by 0x9D4F56: ServerLoop
(../src/backend/postmaster/postmaster.c:1745)
==00:00:03:22.868 1171== by 0x9D23A4: PostmasterMain
(../src/backend/postmaster/postmaster.c:1417)
==00:00:03:22.868 1171== by 0x8AA0D2: main (../src/backend/main/main.c:209)
==00:00:03:22.868 1171== Address 0x3ed1f3c5 is 293 bytes inside a block of
size 8,192 alloc'd
==00:00:03:22.868 1171== at 0x4C37135: malloc (vg_replace_malloc.c:381)
==00:00:03:22.868 1171== by 0xD1C3C0: AllocSetContextCreateInternal
(../src/backend/utils/mmgr/aset.c:469)
==00:00:03:22.868 1171== by 0x821CB4: CreateExprContextInternal
(../src/backend/executor/execUtils.c:253)
==00:00:03:22.868 1171== by 0x821BE2: CreateExprContext
(../src/backend/executor/execUtils.c:303)
==00:00:03:22.868 1171== by 0x822078: ExecAssignExprContext
(../src/backend/executor/execUtils.c:482)
==00:00:03:22.868 1171== by 0x8637D6: ExecInitSeqScan
(../src/backend/executor/nodeSeqscan.c:147)
==00:00:03:22.868 1171== by 0x816B2D: ExecInitNode
(../src/backend/executor/execProcnode.c:211)
==00:00:03:22.868 1171== by 0x80A336: InitPlan
(../src/backend/executor/execMain.c:936)
==00:00:03:22.868 1171== by 0x809BE7: standard_ExecutorStart
(../src/backend/executor/execMain.c:263)
==00:00:03:22.868 1171== by 0x1362303C: pgss_ExecutorStart
(../contrib/pg_stat_statements/pg_stat_statements.c:963)
==00:00:03:22.868 1171== by 0x809881: ExecutorStart
(../src/backend/executor/execMain.c:141)
==00:00:03:22.868 1171== by 0xAD636A: PortalStart
(../src/backend/tcop/pquery.c:514)
==00:00:03:22.868 1171== by 0xAD1A31: exec_simple_query
(../src/backend/tcop/postgres.c:1175)
==00:00:03:22.868 1171== by 0xAD0D7F: PostgresMain
(../src/backend/tcop/postgres.c:4496)
==00:00:03:22.868 1171== by 0x9D6C79: BackendRun
(../src/backend/postmaster/postmaster.c:4530)
==00:00:03:22.868 1171== by 0x9D61A3: BackendStartup
(../src/backend/postmaster/postmaster.c:4252)
==00:00:03:22.868 1171== by 0x9D4F56: ServerLoop
(../src/backend/postmaster/postmaster.c:1745)
==00:00:03:22.868 1171== by 0x9D23A4: PostmasterMain
(../src/backend/postmaster/postmaster.c:1417)
==00:00:03:22.868 1171== by 0x8AA0D2: main (../src/backend/main/main.c:209)
==00:00:03:22.868 1171==
==00:00:03:22.868 1171== VALGRINDERROR-END

Then I attached a debugger to inspect the variables involved in the error
reported by Valgrind for file /src/backend/utils/adt/varlena.c:

1045 for (i = S1; i < E1; i++)
1046 p += pg_mblen(p);
1047
1048 ret = (text *) palloc(VARHDRSZ + (p - s));
1049 SET_VARSIZE(ret, VARHDRSZ + (p - s));
1050 memcpy(VARDATA(ret), s, (p - s));

The column "single_byte_col" is supposed to store only 1
byte. Nevertheless, the INSERT command implicitly casts the '๐Ÿ€†' text into
"char". This means that only the first byte of '๐Ÿ€†' ends up stored in the
column.
gdb reports that "pg_mblen(p) = 4" (line 1046), which is expected since the
pg_mblen('๐Ÿ€†') is indeed 4. Later at line 1050, the memcpy will copy 4
bytes instead of 1, hence an out of bounds memory read happens for pointer
's', which effectively copies random bytes.

--
Spiros
(ServiceNow)

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2022-07-13 12:12:12 Re: Building PostgreSQL in external directory is broken?
Previous Message Dagfinn Ilmari Mannsรฅker 2022-07-13 11:59:50 Re: Building PostgreSQL in external directory is broken?