Re: Quesion about querying distributed databases

From: me nefcanto <sn(dot)1361(at)gmail(dot)com>
To: Rob Sargent <robjsargent(at)gmail(dot)com>
Cc: Igor Korot <ikorot01(at)gmail(dot)com>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Quesion about querying distributed databases
Date: 2025-03-06 07:44:25
Message-ID: CAEHBEODHVs2VLkT37iVJ4-QSAnk8x-GuK8Fmsxk=nP2+EycL5g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I appreciate your time guys. Thank you very much.

> Having 1 table per database per server is too ugly.

Our databases are not one table per database. They are mapped to DDD's
bounded contexts and usually by one table per domain entity.
For example, we have these databases:

- Contacts
- Courses
- Seo
- Payment
- Forms
- Geo
- Sales
- Media
- Taxonomy
- ...

These are the tables we have in the Contacts database:

- Addresses
- AddressTypes
- Attributes
- BankAccounts
- ContactContents
- Contacts
- Emails
- Genders
- JobTitles
- JuridicalPersons
- NaturalPersonRelations
- NaturalPersons
- Persons
- Phones
- PhoneTypes
- Relations
- RelationTypes
- SocialNetworks
- SocialProfiles
- Titles

And, these are the tables we have in the Geo database:

- AdministrativeDivisions
- AdministrativeDivisionTypes
- Cities
- CityDivisions
- Countries
- Locations
- SpatialDataItems
- TelephonePrefixes
- TimeZones

But we also do have databases that only have one table in them. The number
of tables is not our criteria to break them. The business semantics is our
criteria.

> Cross-database on MSSQL is identical to the cross schema on Postgres.

Cross-database query in SQL Server is not equivalent to cross-schema
queries in Postgres. Because SQL Server also has the concept of schemas. In
other words, both SQL Server and Postgres let you create databases, create
schemas inside them, and create tables inside schemas. So SQL Server's
cross-schema query equals Postgres's cross-schema query.

> If you truly need cross server support (versus say beefier hardware) how
did you come to choose postgres?

We chose Postgres for these reasons that we did R&D about:

- Native array per column support
- Not having multiple storage engines like MariaDB to be confused about
- Supporting expressions in unique constraints
- It's usually considered one of the best when it comes to performance,
especially in GIS we intend to develop more upon
- As it claims on its website, it's the most advanced open-source
database engine (but to be honest, we saw many serious drawbacks to that
statement)

But here's the deal. We don't have one project only. We don't need
*cross-server
queries* for all of our projects. But we tend to keep our architecture the
same across projects as much as we can. We chose Postgres because we had
experience with SQL Server and MariaDB and assumed that cross-database
query on the same server is something natural. Both of them support that.
And both are very performant on that. On MariaDB all you have to do is to
use `db_name.table_name` and on SQL Server all you have to do is to use
`database_name.schema_name.table_name`. So we thought, for projects that do
not need more than one server, we keep databases on the same server. When
it needed more resources, we start by taking heavy databases onto their own
servers, and we start implementing table partitinong on them.

But we have experienced some amazing improvements too in our initial tests.
For example, creating all databases and tables and database objects on
MariaDB takes more than 400 seconds, while the same took 80 seconds on
Postgres. So amazing performance on DDL.
Also, 1 million records in bulk insertion take almost one-sixth to
on-fourth of the time on MariaDB. These are valuable numbers. They warmed
our hearts to keep digging as much as we can to see if we can perform this
migration.

Regards
Saeed

On Thu, Mar 6, 2025 at 7:14 AM Rob Sargent <robjsargent(at)gmail(dot)com> wrote:

>
>
> On Mar 5, 2025, at 8:03 PM, Igor Korot jnit worked great for SQL Server.
> If you're small, we host them all on one server. If you get bigger, we can
> put heavy databases on separate machines.
>
>
>> However, I don't have experience working with other types of database
>> scaling. I have used table partitioning, but I have never used sharding.
>>
>> Anyway, that's why I asked you guys. However, encouraging me to go back
>> to monolith without giving solutions on how to scale, is not helping. To be
>> honest, I'm somehow disappointed by how the most advanced open source
>> database does not support cross-database querying just like how SQL Server
>> does. But if it doesn't, it doesn't. Our team should either drop it as a
>> choice or find a way (by asking the experts who built it or use it) how to
>> design based on its features. That's why I'm asking.
>>
>>
> Cross-database on MSSQL is identical to cross schema on postgres. If you
> truly need cross server support (versus say beefier hardware) how did you
> come to choose postgres? The numbers you present are impressive but not
> unheard of on this list.
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Laurenz Albe 2025-03-06 08:04:31 Re: Quesion about querying distributed databases
Previous Message Abraham, Danny 2025-03-06 07:39:17 Asking for OK for a nasty trick to resolve PG CVE-2025-1094 i