Building lots of indices in parallel

From: Ron <ronljohnsonjr(at)gmail(dot)com>
To: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Building lots of indices in parallel
Date: 2021-02-12 03:31:40
Message-ID: 7fa62c6a-dfab-4d0e-17cc-e4030f83fcc9@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


We're migrating a big database (200ish tables 30 of which have bytea or xml
fields; the Oracle size is 10TB) to Postgresql 12.5 (RDS, if it matters),
and after the loads are completed, it will be time to create the secondary
indices (tables currently only have PK constraints), and they of course need
to be built as fast as possible.

The first things to do are based in Postgres itself:
ALTER TABLE foo SET (parallel_workers = 4);
SET max_parallel_maintenance_workers TO 4;
CREATE INDEX foo_ind2 ON foo (f1);

That creates one index very quickly, but only one index at a time, wasting
the other CPUs (there are 48 in the VM, and 384GB RAM).

What is the impact on locking/contention of using something like GNU
Parallel (especially when it tries to create multiple indices on the same
table in parallel)?

Is there a better way to quickly create lots of indices?

--
Angular momentum makes the world go 'round.

Browse pgsql-general by date

  From Date Subject
Next Message Matthias Apitz 2021-02-12 10:24:37 kernel.shmmax and kernel.shmall for Linux server
Previous Message Martin Ritchie 2021-02-12 02:42:35 Re: Encryption of Data Specific to a Tenant in PostgreSQL database | General Idea