Re: Non-text mode for pg_dumpall

From: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Non-text mode for pg_dumpall
Date: 2025-01-01 20:35:13
Message-ID: CAKYtNArH1=1gf=1t-OErx5G0y2P1nog1atQ4HMyxY2S5ocdnSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here, I am attaching an updated patch. I fixed some bugs of v01 patch and
did some code cleanup also.

TODO WIP 1: after excluding databases, we have paths of all the databases
that are needed to
restore so we can launch parallel workers for each database. I am studying
for this part.

TODO WIP 2: exclude-database=NAME, for pg_restore, I am using NAME as of
now, I will try to make it PATTERN. PATTERN
should be matched from map.dat file.

Please have a look over the patch and let me know feedback.

On Tue, 31 Dec 2024 at 23:53, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
wrote:

> Hi all,
> With the help of Andrew and Dilip Kumar, I made a poc patch to dump all
> the databases in archive format and then restore them using pg_restore.
>
> Brief about the patch:
> new option to pg_dumpall:
> -F, --format=d|p (directory|plain) output file format (directory, plain
> text (default))
>
> Ex: ./pg_dumpall --format=directory --file=dumpDirName
>
> dumps are as:
> global.dat ::: global sql commands in simple plain format
> map.dat. ::: dboid dbname ---entries for all databases in simple text
> form
> databases. :::
> subdir dboid1 -> toc.dat and data files in archive format
> subdir dboid2. -> toc.dat and data files in archive format
> etc
> ---------------------------------------------------------------------------
>
> new options to pg_restore:
> -g, --globals-only restore only global objects, no databases
> --exclude-database=PATTERN exclude databases whose name matches PATTERN
>
> When we give -g/--globals-only option, then only restore globals, no db
> restoring.
>
> *Design*:
> When --format=directory is specified and there is no toc.dat file in the
> main directory, then check
> for global.dat and map.dat to restore all databases. If both files exist
> in a directory,
> then first restore all globals from global.dat and then restore all
> databases one by one
> from map.dat list.
> While restoring, skip the databases that are given with exclude-database.
>
> ---------------------------------------------------------------------------
> NOTE:
> if needed, restore single db by particular subdir
>
> Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5
> -- here, 5 is the dboid of postgres db
> -- to get dboid, refer dbname in map.file
> --------------------------------------------------------------------------
>
> Please let me know feedback for the attached patch.
>
> On Tue, 11 Jun 2024 at 01:06, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
>> On Mon, Jun 10, 2024 at 6:21 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>>> > On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <
>>> nathandbossart(at)gmail(dot)com>
>>> > wrote:
>>> >> Is there a particular advantage to that approach as opposed to just
>>> using
>>> >> "directory" mode for everything?
>>>
>>> > A gazillion files to deal with? Much easier to work with individual
>>> custom
>>> > files if you're moving databases around and things like that.
>>> > Much easier to monitor eg sizes/dates if you're using it for backups.
>>>
>>> You can always tar up the directory tree after-the-fact if you want
>>> one file. Sure, that step's not parallelized, but I think we'd need
>>> some non-parallelized copying to create such a file anyway.
>>>
>>
>> That would require double the disk space.
>>
>> But you can also just run pg_dump manually on each database and a
>> pg_dumpall -g like people are doing today -- I thought this whole thing was
>> about making it more convenient :)
>>
>> --
>> Magnus Hagander
>> Me: https://www.hagander.net/ <http://www.hagander.net/>
>> Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
>>
>
>
> --
> Thanks and Regards
> Mahendra Singh Thalor
> EnterpriseDB: http://www.enterprisedb.com
>

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v02_poc_pg_dumpall_with_directory_2nd_jan.patch application/octet-stream 35.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2025-01-01 22:12:36 Re: POC: make mxidoff 64 bits
Previous Message Bruce Momjian 2025-01-01 18:32:18 Re: Backport of CVE-2024-10978 fix to older pgsql versions (11, 9.6, and 9.4)