Quick Links

Re:Improving RLS planning

From:	Zhao Rui <875941708(at)qq(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org>
Cc:	pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re:Improving RLS planning
Date:	2022-07-08 02:51:16
Message-ID:	tencent_0FF6973E90D7CD46CDDA81F41FBE8A129C06@qq.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

This patch causes wrong index scan plan with RLS. Maybe function restriction_is_securely_promotable is too strict?

You can reproduce in this way:

create table abc (a integer, b text);
insert into abc select (random()*(10^4))::integer, (random()*(10^4))::text from generate_series(1,100000);

create index on abc(a, lower(b));

ALTER TABLE abc enable ROW LEVEL SECURITY;
ALTER TABLE abc FORCE  ROW LEVEL SECURITY;

CREATE POLICY abc_id_iso_ply on abc to CURRENT_USER USING (a = (current_setting('app.a'::text))::int);

# for bypass user, index scan works fine
explain analyse select * from abc where a=1 and lower(b)='1234';
     Index Scan using abc_a_lower_idx on abc

         Index Cond: ((a = 1) AND (lower(b) = '1234'::text))

# for RLS user, index scan can only use column a, and filter by lower(b)
set app.a=1;
explain analyse select * from abc where a=1 and lower(b)='1234';
     Index Scan using abc_a_lower_idx on abc

         Index Cond: (a = 1)
         Filter: (lower(b) = '1234'::text)

This only occurs when using non-leak-proof functional index. Everything works fine in following way:
create index on abc(a, b);
explain analyse select * from abc where a=1 and b='1234';

I think crucial function is restriction_is_securely_promotable. Maybe it is too strict to reject normal clause match.
Could you please recheck RLS with functional index?

regards, 
Mark Zhao

------------------ Original ------------------
From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>;
Date: Wed, Oct 26, 2016 05:58 AM
To: "pgsql-hackers"<pgsql-hackers(at)postgreSQL(dot)org>;

Subject: Improving RLS planning

Currently, we don't produce very good plans when row-level security
is enabled.  An example is that, given

create table t1 (pk1 int primary key, label text);
create table t2 (pk2 int primary key, fk int references t1);

then for

select * from t1, t2 where pk1 = fk and pk2 = 42;

you would ordinarily get a cheap plan like

 Nested Loop
   ->  Index Scan using t2_pkey on t2
         Index Cond: (pk2 = 42)
   ->  Index Scan using t1_pkey on t1
         Index Cond: (pk1 = t2.fk)

But stick an RLS policy on t1, and that degrades to a seqscan, eg

 Nested Loop
   Join Filter: (t1.pk1 = t2.fk)
   ->  Index Scan using t2_pkey on t2
         Index Cond: (pk2 = 42)
   ->  Seq Scan on t1
         Filter: (label = 'public'::text)

The reason for this is that we implement RLS by turning the reference
to t1 into a sub-SELECT, and the planner's recursive invocation of
subquery_planner produces only a seqscan path for t1, there not being
any reason visible in the subquery for it to do differently.

I have been thinking about improving this by allowing subquery_planner
to generate parameterized paths; but the more I think about that the
less satisfied I am with it.  It will be quite expensive and probably
will still fail to find desirable plans in many cases.  (I've not given
up on parameterized subquery paths altogether --- I just feel it'd be a
brute-force and not very effective way of dealing with RLS.)

The alternative I'm now thinking about pursuing is to get rid of the
conversion of RLS quals to subqueries.  Instead, we can label individual
qual clauses with security precedence markings.  Concretely, suppose we
add an "int security_level" field to struct RestrictInfo.  The semantics
of this would be that a qual with a lower security_level value must be
evaluated before a qual with a higher security_level value, unless the
latter qual is leakproof.  (It would likely also behoove us to add a
"leakproof" bool field to struct RestrictInfo, to avoid duplicate
leakproof-ness checks on quals.  But that's just an optimization.)

In the initial implementation, quals coming from a RangeTblEntry's
securityQuals field would have security_level 0, quals coming from
anywhere else would have security_level 1; except that if we know
there are no security quals anywhere (ie not Query->hasRowSecurity),
we could give all quals security_level 0.  (I think this exception
may be worth making because there's no need to test leakproofness
for a qual with security level 0; it could never be a candidate
for security delay anyway.)

Having done that much, I think all we need in order to get rid of
RLS subqueries, and just stick RLS quals into their relation's
baserestrictinfo list, are two rules:

1. When selecting potential indexquals, a RestrictInfo can be considered
for indexqual use only if it is leakproof or has security_level <= the
minimum among the table's baserestrictinfo clauses.

2. In order_qual_clauses, sort first by security_level and second by cost.

This would already be enough of a win to be worth doing.  I think though
that this mechanism can be extended to also allow getting rid of the
restriction that security-barrier views can't be flattened.  The idea
would be to make sure that quals coming from above the SB view are given
higher security_level values than quals within the SB view.  We'd need
some extra mechanism to make that possible --- perhaps an additional kind
of node within jointree nests to show where there had been a
security-barrier boundary, and then some smarts in distribute_qual_to_rels
to prevent pushing upper quals down past a lower qual of strictly lesser
security level.  But that can come later.  (We do not need such smarts
to fix the RLS problem, because in the initial version, quals with lower
security level than another qual could only exist at the baserel level.)

In short, I'm proposing to throw away the entire existing implementation
for planning of RLS and SB views, and start over.

There are some corner cases I've not entirely worked out, in particular
what security_level to assign to quals generated from EquivalenceClasses.
A safe but not optimal answer would be to assign them the maximum
security_level of any source clause of the EC.  Maybe it's not worth
working harder than that, because most equality operators are leakproof
anyway, so that it wouldn't matter what level we assigned them.

Before I start implementing this, can anyone see a fatal flaw in the
design?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Kyotaro Horiguchi	2022-07-08 02:52:45	Re: DropRelFileLocatorBuffers
Previous Message	Kyotaro Horiguchi	2022-07-08 02:39:25	Re: [PATCH] fix wait_event of pg_stat_activity in case of high amount of connections