Re: How to optimize PostgreSQL Row Security Policies that involve related tables?

From: David Ventimiglia <davidaventimiglia(at)hasura(dot)io>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: How to optimize PostgreSQL Row Security Policies that involve related tables?
Date: 2023-08-14 17:55:53
Message-ID: CADE7j6iA_Zrdovs-FK5wPthu=gWSTKK+bxj=SzkZzo2KDdszrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Sorry, I had a typo in my Row Policy. Rather, it should look like this:

create policy album_rls_policy on "Track" for select to public
using (
exists (select * from "Album" where "Album"."AlbumId" =
"Track"."AlbumId")
);

On Mon, Aug 14, 2023 at 9:36 AM David Ventimiglia <
davidaventimiglia(at)hasura(dot)io> wrote:

> Hello!
>
> What are some good ways to create Row Security Policies that restrict rows
> in one table using a filter on a related table, and get good performance?
> It's difficult to describe but let me try to explain.
>
> I'm using the venerable old Chinook database
> <https://github.com/cwoodruff/ChinookDatabase>. Among its tables are
> Artist, Album, and Track. Track has a foreign key constraint on
> Track.AlbumId, and Album has a foreign key constraint on Album.ArtistId.
> Albums belong to Artists while Tracks belong to Albums. I want to have two
> policies:
>
> 1. Restrict access to Album based on Album.ArtistId =
> current_setting('rls.artistId'), supposing that 'rls.artistId' will be set
> to the current artist's valid artistId.
> 2. Restrict access to Track such that the current artist only sees
> tracks on their own albums and not on anybody else's albums.
>
> One challenge is that the Track table doesn't actually have ArtistId
> (naturally). However, if we know there's a valid policy on Album, then we
> should be able to leverage that in a policy on Track. The trouble is, I
> can't do a join in the Track policy (or at least, I haven't figured out how
> to do a join), so I have to resort to a "exists" check. Specifically, the
> two policies look like this:
>
> create policy artist_rls_policy ON "Album" for select to public using
> ("ArtistId"=(current_setting('rls.artistID'))::integer);
>
> create policy album_rls_policy on "Track" for select to public
> using (
> exists (select * from "Album" where "Album"."AlbumId" = "AlbumId") --
> THIS IS THE TROUBLESOME POLICY CHECK HERE
> );
>
> But, the presence of the "exists" check leads to a suboptimal plan:
>
> explain analyze select * from "Track";
>
> QUERY PLAN
>
>
> -------------------------------------------------------------------------------------------------------------------------------------
> Seq Scan on "Track" (cost=8.17..203181.49 rows=5001816 width=56) (actual
> time=4.631..699.831 rows=10003504 loops=1)
> Filter: $0
> InitPlan 1 (returns $0)
> -> Index Scan using "IFK_AlbumArtistId" on "Album" (cost=0.15..8.17
> rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=1)
> Index Cond: ("ArtistId" =
> (current_setting('rls.artistID'::text))::integer)
> Filter: ("AlbumId" IS NOT NULL)
> Planning Time: 0.209 ms
> JIT:
> Functions: 7
> Options: Inlining false, Optimization false, Expressions true,
> Deforming true
> Timing: Generation 0.673 ms, Inlining 0.000 ms, Optimization 0.391 ms,
> Emission 3.793 ms, Total 4.857 ms
> Execution Time: 876.035 ms
> (12 rows)
>
> In my query, sure, I can explicitly join Track to Album to get a better
> plan:
>
> explain analyze select * from "Track" natural join "Album";
>
> QUERY
> PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=8.75..37450.90 rows=14414 width=83) (actual
> time=0.184..0.314 rows=14 loops=1)
> InitPlan 1 (returns $0)
> -> Index Scan using "IFK_AlbumArtistId" on "Album" "Album_1"
> (cost=0.15..8.17 rows=1 width=0) (actual time=0.033..0.034 rows=1 loops=1)
> Index Cond: ("ArtistId" =
> (current_setting('rls.artistID'::text))::integer)
> Filter: ("AlbumId" IS NOT NULL)
> -> Index Scan using "PK_Album" on "Album" (cost=0.15..22.82 rows=1
> width=31) (actual time=0.136..0.252 rows=1 loops=1)
> Filter: ("ArtistId" =
> (current_setting('rls.artistID'::text))::integer)
> Rows Removed by Filter: 346
> -> Index Scan using "IFK_TrackAlbumId" on "Track"
> (cost=0.43..32418.09 rows=500182 width=56) (actual time=0.044..0.051
> rows=14 loops=1)
> Index Cond: ("AlbumId" = "Album"."AlbumId")
> Filter: $0
> Planning Time: 0.509 ms
> Execution Time: 0.364 ms
> (13 rows)
>
> But, that's redundant since there's enough information to process the
> query correctly (albeit more slowly) without the join. The question is,
> how if at all can I create the policies to have both the desired
> composability and also good performance? I hope I've explained this well
> enough. I asked this question a while back on StackOverflow
> <https://stackoverflow.com/questions/76525562/how-to-optimize-postgresql-row-security-policies-that-involve-related-tables> but
> got no interest. I also have sample code in a public GitHub repository
> here
> <https://github.com/dventimihasura/hasura-projects/tree/master/rls-optimization-1> that
> illustrates the setup. Any advice would be greatly appreciated. Thank you!
>
> Kind regards,
> David A. Ventimiglia
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ron 2023-08-14 19:14:18 Re: Fatal Error : Invalid Memory alloc request size 1236252631
Previous Message Sai Teja 2023-08-14 17:31:19 Re: Fatal Error : Invalid Memory alloc request size 1236252631