Re: Non-text mode for pg_dumpall

From: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Non-text mode for pg_dumpall
Date: 2024-12-31 18:23:40
Message-ID: CAKYtNAryL7Y-JEasiDCts+iiMRufNi74HpPvf=T4b9a1NSa_PQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,
With the help of Andrew and Dilip Kumar, I made a poc patch to dump all the
databases in archive format and then restore them using pg_restore.

Brief about the patch:
new option to pg_dumpall:
-F, --format=d|p (directory|plain) output file format (directory, plain
text (default))

Ex: ./pg_dumpall --format=directory --file=dumpDirName

dumps are as:
global.dat ::: global sql commands in simple plain format
map.dat. ::: dboid dbname ---entries for all databases in simple text form
databases. :::
subdir dboid1 -> toc.dat and data files in archive format
subdir dboid2. -> toc.dat and data files in archive format
etc
---------------------------------------------------------------------------

new options to pg_restore:
-g, --globals-only restore only global objects, no databases
--exclude-database=PATTERN exclude databases whose name matches PATTERN

When we give -g/--globals-only option, then only restore globals, no db
restoring.

*Design*:
When --format=directory is specified and there is no toc.dat file in the
main directory, then check
for global.dat and map.dat to restore all databases. If both files exist in
a directory,
then first restore all globals from global.dat and then restore all
databases one by one
from map.dat list.
While restoring, skip the databases that are given with exclude-database.

---------------------------------------------------------------------------
NOTE:
if needed, restore single db by particular subdir

Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5
-- here, 5 is the dboid of postgres db
-- to get dboid, refer dbname in map.file
--------------------------------------------------------------------------

Please let me know feedback for the attached patch.

On Tue, 11 Jun 2024 at 01:06, Magnus Hagander <magnus(at)hagander(dot)net> wrote:

> On Mon, Jun 10, 2024 at 6:21 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> > On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <
>> nathandbossart(at)gmail(dot)com>
>> > wrote:
>> >> Is there a particular advantage to that approach as opposed to just
>> using
>> >> "directory" mode for everything?
>>
>> > A gazillion files to deal with? Much easier to work with individual
>> custom
>> > files if you're moving databases around and things like that.
>> > Much easier to monitor eg sizes/dates if you're using it for backups.
>>
>> You can always tar up the directory tree after-the-fact if you want
>> one file. Sure, that step's not parallelized, but I think we'd need
>> some non-parallelized copying to create such a file anyway.
>>
>
> That would require double the disk space.
>
> But you can also just run pg_dump manually on each database and a
> pg_dumpall -g like people are doing today -- I thought this whole thing was
> about making it more convenient :)
>
> --
> Magnus Hagander
> Me: https://www.hagander.net/ <http://www.hagander.net/>
> Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
>

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v01_poc_pg_dumpall_with_directory_31dec.patch application/octet-stream 33.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2024-12-31 19:46:37 Re: Backport of CVE-2024-10978 fix to older pgsql versions (11, 9.6, and 9.4)
Previous Message Larry Rosenman 2024-12-31 18:22:14 Strange issue with NFS mounted PGDATA on ugreen NAS