From: | Dominique Devienne <ddevienne(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Ways to aggregate table info from variable number of schemas |
Date: | 2023-11-27 18:16:01 |
Message-ID: | CAFCRh-8k-8eAq_YQ1_Njqt8eO4Sej65+=DAHeYVRUbg=4Sz-bw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Our PostgreSQL-based system uses a small number of fixed schemas,
and an arbitrary number of per-project schemas (one schema per project).
This will is given, sorry, cannot be changed. Really.
The fixed schemas contain metadata about the projects (including which
schema a given project uses).
My problem is that some of the important metadata is not in the fixed
(administrative) schemas,
but in the per-project schemas. Which means you can't do a declarative SQL
query for those
metadata across projects, since you can't do static / non-dynamic SQL
across schemas.
I've brainstormed the different ways I could easily query those
"spread-out" metadata,
and I'm trying my chance at community input. Here's what I've considered so
far:
1) Mega union-all view, updated (programmatically) when schemas are
added/removed
2) Concrete tables in one of the fixed schemas, the project-schemas push
data to, via triggers
3) Partitioned table. No experience with it, but from what I've read, a
manually created partitioned table with manually added/removed partition
seems viable. The caveat being that I don't have a (virtual) column right
now in the per-project schema tables that would identify the rows across
schemas/projects. But that could be added, if this is viable.
4) SQL functions taking a project-id, and doing dynamic SQL inside against
a given schema, to fetch the metadata of interest for that SQL function.
Most of a PULL approach, which allows a kind of hidden dynamic JOIN when
scanning the projects from the fixed schema table(s). This might be the
simplest?
Maybe there are other options too?
Basically I need to do a little data-warehouse of sort, no?
Insights, inputs, would be greatly appreciated. Thanks, --DD
PS: A recent comment by Tom seemed to indicate that union-all views are not
well optimized, and don't trim branches of the union-all based on the where
clause. So would not scale well (one internal install has 3000 projects)
PPS: I like the idea of the partitioned table, since could do efficient
partition elimination, but this is completely new territory for me, and I
fear the complexity might be high. Thoughts?
From | Date | Subject | |
---|---|---|---|
Next Message | Christophe Pettus | 2023-11-27 18:21:08 | Re: Ways to aggregate table info from variable number of schemas |
Previous Message | Dominique Devienne | 2023-11-27 16:14:46 | Re: Emitting JSON to file using COPY TO |