Re: Unbounded %s in sscanf

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Unbounded %s in sscanf
Date: 2021-10-15 11:44:12
Message-ID: AC41989D-65D5-4621-9A29-1D63C3DEC240@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 30 Jul 2021, at 18:03, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Daniel Gustafsson <daniel(at)yesql(dot)se> writes:
>> I took another look at this today, and propose to push the attached. The
>> pg_dump fix goes all the way back to 9.6 whereas the pg_basebackup fix is from
>> 11 and onwards. The adjacent shadowed variable bug in pg_dump is also present
>> since 9.6.
>> Thoughts?

Reviving an old thread that had gotten lost in the mists of the INBOX.

> Generally +1, though I wonder if it'd be prudent to deal with the
> shadowed-variable bug by renaming *both* variables. "fname" is
> clearly too generic in a function that deals with multiple file
> names.

Good point, done in the attached.

> Another thing that is nibbling at the back of my mind is that one
> reason we started to use src/port/snprintf.c all the time is that
> glibc's *printf functions behave in a very unfriendly fashion when
> asked to print text that they think is invalidly encoded, but only
> if the format involves an explicit field width spec. I wonder if
> we're opening ourselves to similar problems if we start to use
> field widths with *scanf. In principle, I think the input text
> always ought to be ASCII in these cases, so that there's no hazard.
> But is there an interesting security aspect here? That is, if someone
> can inject a maliciously-crafted file containing non-ASCII data, what
> kind of misbehavior could ensue? It might be that sscanf would just
> report failure and we'd give up, which would be fine. But if a
> stack overrun could be triggered that way, it'd not be fine.

sscanf won't fail in that case. For multibyte input, %xs will simply stop
after x bytes, ignoring torn characters with a (highly likely) incorrect value
in the result variable. Using %xls (or %xS) would however count x towards the
number of multibytes, which if stored in a normal char* variable could result
in an overflow.

With a width specifier this isn't really a vector. If an attacker can inject
multibyte X which after a torn read results in z being parsed and acted upon,
she could also by definition inject z to begin with.

Without a width specifier, If a malicious actor manages to inject multibyte (or
just too many bytes), it could however lead to a stack overflow as sscanf will
keep reading until a whitespace byte.

I propose to apply the attached all the way down (with the basebackup hunk from
11), or down to 10 if we want to be conservative with the final 9.6 re ancient
bugs that haven't seen complaints.

--
Daniel Gustafsson https://vmware.com/

Attachment Content-Type Size
v5-0001-Fix-sscanf-limits-in-pg_basebackup-and-pg_dump.patch application/octet-stream 2.2 KB
v5-0002-Fix-bug-in-TOC-file-error-message-printing.patch application/octet-stream 3.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeevan Ladhe 2021-10-15 11:54:15 Re: refactoring basebackup.c
Previous Message houzj.fnst@fujitsu.com 2021-10-15 11:22:50 Data is copied twice when specifying both child and parent table in publication