Fastest way to clone schema ~1000x

From: Emiel Mols <emiel(at)crisp(dot)nl>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Fastest way to clone schema ~1000x
Date: 2024-02-26 06:28:19
Message-ID: CAF5w505Cc_Ft6AWSNnqAfF5=2erkf4ubRtYewKhq2-XHUBYjvQ@mail.gmail.com
Lists: pgsql-general

Hello,

To improve our unit and end-to-end testing performance, we are looking to
optimize initialization of around 500-1000 database *schemas* from a
schema.sql file.

Background: in Postgres you cannot switch databases on an existing
connection, and using schemas would allow us to minimize the number of
active connections needed from our test runners, since we can reuse
connections by updating search_path. In our current database-per-test setup
we see around ~30% (!!) of total CPU spent in
native_queued_spin_lock_slowpath (as profiled with Linux perf), presumably
because of the high connection count. We run ~200 tests in parallel to
saturate a 128-core machine.
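A minimal sketch of the connection-reuse idea: build the `SET search_path` statement for a test's schema (validating the name, since it is interpolated as an identifier) and issue it on a pooled connection before each test. The helper name and the commented psycopg usage are illustrative, not part of our actual harness:

```python
import re

def set_search_path_sql(schema: str) -> str:
    """Build a SET search_path statement, validating the schema name
    so a test parameter can't inject arbitrary SQL as an identifier."""
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", schema):
        raise ValueError(f"unsafe schema name: {schema!r}")
    return f"SET search_path TO {schema}, public"

# On a live connection (psycopg shown; any driver works the same way):
#   conn.execute(set_search_path_sql("test001"))
#   ... run the test's queries, then hand the connection to the next test.
```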

In the schema-per-test setup, however, it becomes harder to create the
schema cheaply. Before, we could `create database test001 with template
testbase` to set up the database for a test, and this was reasonably fast.
Re-inserting a schema with ~500 table/index definitions across 500 test
schemas is prohibitively expensive (around 1000ms per schema insertion
means we're wasting 500 cpu-seconds, and there seems to be quite some lock
contention too). Linux perf shows that in this setup the
native_queued_spin_lock_slowpath overhead drops to around 10%, but total
test time is still slower due to all the schema initialization work. Also,
it feels quite elegant that functions and types can be shared between tests.
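One way to amortize the per-schema insertion cost is to expand the single schema.sql into one combined script that creates every test schema and replays the definitions under each schema's search_path, so it can run in a single session (and, optionally, a single transaction). A hedged sketch, assuming schema.sql uses only unqualified object names; `clone_schema_script` is a hypothetical helper:

```python
def clone_schema_script(schema_sql: str, names: list[str]) -> str:
    """Expand one template schema.sql into a single script that creates
    each test schema and replays the definitions inside it by pointing
    search_path at that schema before each copy."""
    parts = []
    for name in names:
        parts.append(f"CREATE SCHEMA {name};")
        parts.append(f"SET search_path TO {name};")
        parts.append(schema_sql)
    parts.append("RESET search_path;")
    return "\n".join(parts)
```

The resulting script could be fed through a single `psql -1 -f -` invocation so the whole batch is one round of parsing/locking per session rather than 500 separate client sessions.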

Does anyone have some insights or great ideas :)? We're also pretty curious
about the fundamental reason why high/changing connection counts cause this
much (spin)lock contention in Postgres (or perhaps we're doing something
wrong in our configuration or measurements?).

An alternative we haven't explored yet is to see whether pgbouncer or
another connection pooler can mitigate the 30% issue (set limits so there
are only ever X connections to Postgres in total, and perhaps max Y per
test/database). This does add another piece of infrastructure/complexity,
though, so it's not really preferred.
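For reference, a pooler setup along those lines might look like the pgbouncer.ini fragment below; the numbers are guesses to tune, not recommendations. One caveat: under transaction pooling, session state such as a plain `SET search_path` does not survive across transactions, so each test transaction would need `SET LOCAL search_path` instead:

```ini
; illustrative pgbouncer.ini fragment -- values are guesses to tune
[databases]
testdb = host=127.0.0.1 port=5432 dbname=testdb

[pgbouncer]
pool_mode = transaction      ; reuse server connections across tests
max_client_conn = 400        ; ~200 parallel tests, plus headroom
default_pool_size = 32       ; caps server connections per db/user pair
```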

Best,

Emiel
