Re: Improper use about DatumGetInt32

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improper use about DatumGetInt32
Date: 2020-09-23 13:58:53
Message-ID: CAExHW5s7e_4F3ANDTVZ8ESMojCnFUrpanC9srTrui8HMb895Zg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 23, 2020 at 1:41 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Mon, Sep 21, 2020 at 3:53 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> I think we mostly use it for the few places where we currently expose
> >> data as a signed integer on the SQL level, but internally actually treat
> >> it as a unsigned data.
>
> > So why is the right solution to that not DatumGetInt32() + a cast to uint32?
>
> You're ignoring the xid use-case, for which DatumGetUInt32 actually is
> the right thing.

There is DatumGetTransactionId() which should be used instead.
That made me search if there's PG_GETARG_TRANSACTIONID() and yes it's
there but only defined in xid.c. So pg_xact_commit_timestamp(),
pg_xact_commit_timestamp_origin() and pg_get_multixact_members() use
PG_GETARG_UNIT32. IMO those should be changed to use
PG_GETARG_TRANSACTIONID. That would require moving
PG_GETARG_TRANSACTIONID somewhere outside xid.c; may be fmgr.h where
other PG_GETARG_* are.

> I tend to agree though that if the SQL argument is
> of a signed type, the least API-abusing answer is a signed DatumGetXXX
> macro followed by whatever cast you need.
>

I looked for some uses of PG_GETARG_UNIT32() which is the counterpart
of DatumGetUint32(). Found some buggy usages apart from the ones which
can be converted to PG_GETARG_TRANSACTIONID listed above.
normal_rand() for example returns a huge number of rows and takes
forever if we pass a negative first argument to it. Someone could
misuse that for a DOS attack or it could be just an accident that they
pass a negative value to that function and the query takes forever.
explain analyze select count(*) from normal_rand(-1000000, 1.0, 1.0);
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=12.50..12.51 rows=1 width=8) (actual
time=2077574.718..2077574.719 rows=1 loops=1)
-> Function Scan on normal_rand (cost=0.00..10.00 rows=1000
width=0) (actual time=1005176.149..1729994.366 rows=4293967296
loops=1)
Planning Time: 0.346 ms
Execution Time: 2079034.835 ms

get_raw_page() also does similar thing but the effect is not as dangerous
SELECT octet_length(get_raw_page('test1', 'main', -1)) AS main_1;
ERROR: block number 4294967295 is out of range for relation "test1"
Similarly for bt_page_stats() and bt_page_items()

PFA patches to correct those.

There's Oracle compatible chr() which also uses PG_GETARG_UINT32() but
it's (accidentally?) reporting the negative inputs correctly because
it filters out very large values and reports those using %d. It's
arguable whether we should change that, so I have left it untouched.
But I think we should change that as well and get rid of
PG_GETARG_UNIT32 altogether. This will prevent any future misuse.

--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
0001-Handle-negative-number-of-tuples-passed-to-normal_ra.patch application/x-patch 2.9 KB
0002-Negative-block-number-passed-to-functions-in-pageins.patch application/x-patch 7.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-09-23 14:39:10 Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)
Previous Message Stephen Frost 2020-09-23 13:46:50 Re: new heapcheck contrib module