How to optimize PostgreSQL Row Security Policies that involve related tables?

From: David Ventimiglia <davidaventimiglia(at)hasura(dot)io>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: How to optimize PostgreSQL Row Security Policies that involve related tables?
Date: 2023-08-14 16:36:41
Message-ID: CADE7j6hzNEsdrfbUxZUKEZXz-B0Vz1Qy-mFo4hB+QXWsgAVv9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello!

What are some good ways to create Row Security Policies that restrict rows
in one table using a filter on a related table, and get good performance?
It's difficult to describe but let me try to explain.

I'm using the venerable old Chinook database
<https://github.com/cwoodruff/ChinookDatabase>. Among its tables are
Artist, Album, and Track. Track has a foreign key constraint on
Track.AlbumId, and Album has a foreign key constraint on Album.ArtistId.
Albums belong to Artists while Tracks belong to Albums. I want to have two
policies:

1. Restrict access to Album based on Album.ArtistId =
current_setting('rls.artistId'), supposing that 'rls.artistId' will be set
to the current artist's valid artistId.
2. Restrict access to Track such that the current artist only sees
tracks on their own albums and not on anybody else's albums.

One challenge is that the Track table doesn't actually have ArtistId
(naturally). However, if we know there's a valid policy on Album, then we
should be able to leverage that in a policy on Track. The trouble is, I
can't do a join in the Track policy (or at least, I haven't figured out how
to do a join), so I have to resort to a "exists" check. Specifically, the
two policies look like this:

create policy artist_rls_policy ON "Album" for select to public using
("ArtistId"=(current_setting('rls.artistID'))::integer);

create policy album_rls_policy on "Track" for select to public
using (
exists (select * from "Album" where "Album"."AlbumId" = "AlbumId") --
THIS IS THE TROUBLESOME POLICY CHECK HERE
);

But, the presence of the "exists" check leads to a suboptimal plan:

explain analyze select * from "Track";

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on "Track" (cost=8.17..203181.49 rows=5001816 width=56) (actual
time=4.631..699.831 rows=10003504 loops=1)
Filter: $0
InitPlan 1 (returns $0)
-> Index Scan using "IFK_AlbumArtistId" on "Album" (cost=0.15..8.17
rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=1)
Index Cond: ("ArtistId" =
(current_setting('rls.artistID'::text))::integer)
Filter: ("AlbumId" IS NOT NULL)
Planning Time: 0.209 ms
JIT:
Functions: 7
Options: Inlining false, Optimization false, Expressions true, Deforming
true
Timing: Generation 0.673 ms, Inlining 0.000 ms, Optimization 0.391 ms,
Emission 3.793 ms, Total 4.857 ms
Execution Time: 876.035 ms
(12 rows)

In my query, sure, I can explicitly join Track to Album to get a better
plan:

explain analyze select * from "Track" natural join "Album";

QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=8.75..37450.90 rows=14414 width=83) (actual
time=0.184..0.314 rows=14 loops=1)
InitPlan 1 (returns $0)
-> Index Scan using "IFK_AlbumArtistId" on "Album" "Album_1"
(cost=0.15..8.17 rows=1 width=0) (actual time=0.033..0.034 rows=1 loops=1)
Index Cond: ("ArtistId" =
(current_setting('rls.artistID'::text))::integer)
Filter: ("AlbumId" IS NOT NULL)
-> Index Scan using "PK_Album" on "Album" (cost=0.15..22.82 rows=1
width=31) (actual time=0.136..0.252 rows=1 loops=1)
Filter: ("ArtistId" =
(current_setting('rls.artistID'::text))::integer)
Rows Removed by Filter: 346
-> Index Scan using "IFK_TrackAlbumId" on "Track" (cost=0.43..32418.09
rows=500182 width=56) (actual time=0.044..0.051 rows=14 loops=1)
Index Cond: ("AlbumId" = "Album"."AlbumId")
Filter: $0
Planning Time: 0.509 ms
Execution Time: 0.364 ms
(13 rows)

But, that's redundant since there's enough information to process the query
correctly (albeit more slowly) without the join. The question is, how if
at all can I create the policies to have both the desired composability and
also good performance? I hope I've explained this well enough. I asked
this question a while back on StackOverflow
<https://stackoverflow.com/questions/76525562/how-to-optimize-postgresql-row-security-policies-that-involve-related-tables>
but
got no interest. I also have sample code in a public GitHub repository here
<https://github.com/dventimihasura/hasura-projects/tree/master/rls-optimization-1>
that
illustrates the setup. Any advice would be greatly appreciated. Thank you!

Kind regards,
David A. Ventimiglia

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rob Sargent 2023-08-14 17:25:27 Re: Fatal Error : Invalid Memory alloc request size 1236252631
Previous Message Ron 2023-08-14 15:44:35 Re: Query plan regression between CTE and views