pgsql: Account for TOAST data while scheduling parallel dumps.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Account for TOAST data while scheduling parallel dumps.
Date: 2021-12-06 18:23:20
Message-ID: E1muIe0-0007N8-RS@gemulon.postgresql.org
Lists: pgsql-committers

Account for TOAST data while scheduling parallel dumps.

In parallel mode, pg_dump tries to order the table-data-dumping
jobs with the largest tables first. However, it was only
consulting the pg_class.relpages value to determine table size.
This ignores TOAST data, and so we could make poor scheduling
decisions in cases where some large tables are mostly TOASTed
data while others have very little. To fix, add in the relpages
value for the TOAST table as well.
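As a rough illustration of the scheduling idea described above (not the code from the patch), the sketch below sorts a list of table-data jobs by combined heap and TOAST page counts, largest first. The TableJob struct, its field names, and the helper functions are hypothetical and chosen only for this example; pg_dump's real data structures differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical job record: page counts as they might be read from pg_class. */
typedef struct TableJob
{
    const char *name;
    uint32_t    relpages;    /* pages in the main heap */
    uint32_t    toastpages;  /* pages in the associated TOAST table, 0 if none */
} TableJob;

/* Size estimate used for ordering: heap pages plus TOAST pages, in 64 bits. */
static uint64_t
job_size(const TableJob *job)
{
    return (uint64_t) job->relpages + (uint64_t) job->toastpages;
}

/* qsort comparator that places the largest job first. */
static int
compare_jobs_desc(const void *a, const void *b)
{
    uint64_t sa = job_size((const TableJob *) a);
    uint64_t sb = job_size((const TableJob *) b);

    if (sa > sb)
        return -1;
    if (sa < sb)
        return 1;
    return 0;
}

/* Reorder jobs in place so the largest estimated tables are dumped first. */
static void
order_jobs_largest_first(TableJob *jobs, size_t njobs)
{
    qsort(jobs, njobs, sizeof(TableJob), compare_jobs_desc);
}
```

With relpages alone, a table holding 10 heap pages and 5000 TOAST pages would sort behind one with 1000 heap pages and no TOAST data; with the combined estimate it sorts ahead.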

This patch also fixes a potential integer-overflow issue that
could result in poor scheduling on machines where off_t is
only 32 bits wide. Such platforms are probably extinct in the
wild, but we do still nominally support them, so repair.
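To make the overflow concern concrete, here is a minimal sketch of one way to keep a size estimate safe when the destination type is only 32 bits wide: do the arithmetic in a wider type and saturate at the maximum representable value. The function name and the choice of int32_t as a stand-in for a 32-bit off_t are assumptions for illustration; the actual patch may handle this differently.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine two page counts into a signed 32-bit
 * estimate.  The sum is computed in 64 bits, so it cannot wrap, and any
 * result too large for int32_t is clamped to INT32_MAX.
 */
static int32_t
combined_pages_clamped(uint32_t relpages, uint32_t toastpages)
{
    uint64_t total = (uint64_t) relpages + (uint64_t) toastpages;

    if (total > (uint64_t) INT32_MAX)
        return INT32_MAX;
    return (int32_t) total;
}
```

Clamping loses the distinction between very large tables, but it keeps them sorted ahead of smaller ones, whereas a wrapped value could make a huge table look small or negative.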

Per complaint from Hans Buschmann.

Discussion: https://postgr.es/m/7d7eb6128f40401d81b3b7a898b6b4de@W2012-02.nidsa.loc

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/65aaed22a849c0763f38f81338a1cad04ffc0e2c

Modified Files
--------------
src/bin/pg_dump/pg_dump.c | 35 ++++++++++++++++++++++++++---------
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 27 insertions(+), 9 deletions(-)
