Re: query multiple schemas

From: Norbert Sándor <sandor(dot)norbert(at)erinors(dot)com>
To: Dominique Devienne <ddevienne(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Steve Baldwin <steve(dot)baldwin(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: query multiple schemas
Date: 2024-04-23 09:08:26
Message-ID: 0d83d817-0ae5-4b1b-af18-63ec89701301@erinors.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<i>&gt; using a dynamic number of schemas</i><br>
<br>
Although there are historical reasons behind our "schema per tenant"
architecture, it provides very good logical separation of data, and
is very convenient that we don't need to include the tenant ID in
each query (I'm sure that it would cause lots of bugs and trouble).<br>
Besides, we use Hibernate and <a
href="https://docs.jboss.org/hibernate/orm/6.4/userguide/html_single/Hibernate_User_Guide.html#multitenacy-separate-schema">it
has great support</a> for this architecture.<br>
<br>
<i>&gt; In any case, I'm interested in what works well for you.</i><br>
<br>
I went with the dynamic solution I proposed in my original email
(which performs a UNION of the queried table from all tenants).<br>
Performance is currently not a priority in our use-cases, otherwise
I would have probably chosen a more static solution (which wouldn't
be easy to maintain continuously, I'm sure of).<br>
<br>
I further simplified its usage, so finally it is fairly comfortable
to use either directly:<br>
<blockquote><font face="monospace">select tenantId, (record).*<br>
from tenant_query(null::mytable)<br>
where (record).type=2<br>
order by tenantId, (record).name;</font><br>
</blockquote>
or by using a temporary view:<br>
<blockquote><font face="monospace">create temp view
all_tenant_mytable as<br>
select tenantId, (record).* from tenant_query(null::mytable);<br>
<br>
select *<br>
from all_tenant_mytable<br>
where type=2<br>
order by tenantId, name;</font><br>
</blockquote>
<i>&gt; In my case, the revision/version of the schema could be
different as well</i><br>
<br>
This complicates things very much, it is probably not possible to
implement a solution as comfortable as in my case.<br>
<br>
Although I think the JSON-based solutions proposed in this thread
would work in your case as well.<br>
In my current solution I also use JSON as intermediate
representation - although I <i>feel </i>it is because of my lack
of deeper knowledge of Postgresql's type system.<br>
The difference is that you would need to use JSON as the final
representation, and reference the JSON fields using <a
href="https://www.postgresql.org/docs/current/functions-json.html">Postgres's
JSON operators</a>.<br>
<br>
<i>&gt; And if/when I get back to this issue myself, I'll do the
same.</i><br>
<br>
My current solution is not much different than the one I posted in
my original question.<br>
My main difficulty was the relatively static nature of Postgresql's
type system, so this solution is a result of lots of trial-and-error
rounds :)<br>
Take a look at it, and you (and maybe others) may have
recommendations, e.g. how I could get rid of the usage of the
intermediate JSON "layer".<br>
(Although I have to admit: it is amazing that it is possible to
implement this at all ;) )<br>
<blockquote>
<div style="background-color:#ffffff;padding:0px 0px 0px 2px;">
<div
style="color:#000000;background-color:#ffffff;font-family:&quot;Liberation Mono&quot;;font-size:8pt;white-space:pre;"><p
style="margin:0;">CREATE OR REPLACE FUNCTION tenant_query_json(<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">tbl</span> <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">anyelement</span>)</p><p
style="margin:0;">RETURNS <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">setof</span> <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">json</span> AS $<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">func</span>$</p><p
style="margin:0;">declare</p><p style="margin:0;"> _select text;</p><p
style="margin:0;">begin</p><p style="margin:0;"> _select := (select</p><p
style="margin:0;"> string_agg(</p><p style="margin:0;"> format('select t.*, %L tenantId from %I.%I t', schema_name, schema_name, pg_typeof(<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">tbl</span>)),</p><p
style="margin:0;"> E'\n' || ' union all ' || E'\n')</p><p
style="margin:0;"> from (</p><p style="margin:0;"> SELECT schema_name</p><p
style="margin:0;"> FROM information_schema.schemata</p><p
style="margin:0;"> where schema_name not in ('information_schema') and schema_name not like '<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">pg</span>_%'</p><p
style="margin:0;"> ) tenants</p><p style="margin:0;"> );</p><p
style="margin:0;">
</p><p style="margin:0;"> return query execute 'select row_to_json(r) from (' || _select || ') as r';</p><p
style="margin:0;">END;</p><p style="margin:0;">$<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">func</span>$ LANGUAGE <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">plpgsql</span>;</p><p
style="margin:0;">
</p><p style="margin:0;">
</p><p style="margin:0;">CREATE OR REPLACE FUNCTION tenant_query(<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">tbl</span> <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">anyelement</span>)</p><p
style="margin:0;">RETURNS table(tenantId text, record <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">anyelement</span>) AS $<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">func</span>$</p><p
style="margin:0;">begin</p><p style="margin:0;"> return query</p><p
style="margin:0;"> select t.tenantId, t.rec</p><p
style="margin:0;"> from (</p><p style="margin:0;"> select</p><p
style="margin:0;"> <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">jr</span>-&gt;&gt;'<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">tenantid</span>' tenantId,</p><p
style="margin:0;"> json_populate_record(<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">tbl</span>, <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">jr</span>) <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">rec</span></p><p
style="margin:0;"> from tenant_query_json(<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">tbl</span>) <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">jr</span></p><p
style="margin:0;"> ) t;</p><p style="margin:0;">END;</p><p
style="margin:0;">$<span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">func</span>$ LANGUAGE <span
style="text-decoration:underline;text-decoration-color:#ff8040;text-decoration-style:wavy;">plpgsql</span>;</p><p
style="margin:0;">
</p></div>
</div>
</blockquote>
As you can see in my examples above, I use the <font
face="monospace">tenant_query()</font> function but in your case
(if your schemas are different) something similar to <font
face="monospace">tenant_query_json()</font> may work better.<br>
<br>
--<br>
Norbi<br>
<br>
<div class="moz-cite-prefix">On 2024. 04. 23. 9:33, Dominique
Devienne wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAFCRh--7Oh2CFdAhqN-GXbGt63djxHN2SYLHUwt_MKcYZZuTKg(at)mail(dot)gmail(dot)com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">On Sun, Apr 21, 2024 at 11:12 PM Tom Lane &lt;<a
href="mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us" moz-do-not-send="true"
class="moz-txt-link-freetext">tgl(at)sss(dot)pgh(dot)pa(dot)us</a>&gt;
wrote:<br>
</div>
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Steve
Baldwin &lt;<a href="mailto:steve(dot)baldwin(at)gmail(dot)com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">steve(dot)baldwin(at)gmail(dot)com</a>&gt;
writes:<br>
&gt; If the number of tenant schemas is reasonably static,
you could write a<br>
&gt; plpgsql function to create a set of UNION ALL views<br>
<br>
Another idea is to build a partitioned table<br>
</blockquote>
<div><br>
</div>
<div>Hi Norbert. I asked a [similar question][1] a while back,</div>
<div>and unfortunately didn't get any actionable input,
perhaps</div>
<div>because I already mentioned in my message the options</div>
<div>proposed here so far. Seems like people like us, using a</div>
<div>dynamic number of schemas, are outliers in database-land.</div>
<div><br>
</div>
<div>In my case, the revision/version of the schema could be</div>
<div>different as well, which would complicate the
partitioning idea.</div>
<div><br>
</div>
<div>In any case, I'm interested in what works well for you.</div>
<div>And if/when I get back to this issue myself, I'll do the
same.</div>
<div><br>
</div>
<div>Thanks, --DD</div>
<div><br>
</div>
<div>[1]: <a
href="https://postgrespro.com/list/thread-id/2673517"
moz-do-not-send="true" class="moz-txt-link-freetext">https://postgrespro.com/list/thread-id/2673517</a> </div>
</div>
</div>
</blockquote>
<br>
</body>
</html>

Attachment Content-Type Size
unknown_filename text/html 10.3 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dominique Devienne 2024-04-23 09:47:32 Re: query multiple schemas
Previous Message Dominique Devienne 2024-04-23 07:33:09 Re: query multiple schemas