Quick Links

Re: Non-text mode for pg_dumpall

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
Cc:	jian he <jian(dot)universality(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Srinath Reddy <srinath2133(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Non-text mode for pg_dumpall
Date:	2025-03-30 22:05:57
Message-ID:	b0abf11e-92a3-429d-9c90-0fe2523c5e03@dunslane.net
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 2025-03-30 Su 12:50 PM, Andrew Dunstan wrote:
>
> On 2025-03-29 Sa 1:17 AM, Mahendra Singh Thalor wrote:
>> On Sat, 29 Mar 2025 at 03:50, Andrew Dunstan <andrew(at)dunslane(dot)net>
>> wrote:
>>>
>>> On 2025-03-27 Th 5:15 PM, Andrew Dunstan wrote:
>>>> On 2025-03-19 We 2:41 AM, Mahendra Singh Thalor wrote:
>>>>> On Wed, 12 Mar 2025 at 21:18, Andrew Dunstan <andrew(at)dunslane(dot)net>
>>>>> wrote:
>>>>>> On 2025-03-12 We 3:03 AM, jian he wrote:
>>>>>>> On Wed, Mar 12, 2025 at 1:06 AM Álvaro Herrera
>>>>>>> <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> On 2025-Mar-11, Mahendra Singh Thalor wrote:
>>>>>>>>
>>>>>>>>> In map.dat file, I tried to fix this issue by adding number of
>>>>>>>>> characters
>>>>>>>>> in dbname but as per code comments, as of now, we are not
>>>>>>>>> supporting \n\r
>>>>>>>>> in dbnames so i removed handling.
>>>>>>>>> I will do some more study to fix this issue.
>>>>>>>> Yeah, I think this is saying that you should not consider the
>>>>>>>> contents
>>>>>>>> of map.dat as a shell string. After all, you're not going to
>>>>>>>> _execute_
>>>>>>>> that file via the shell.
>>>>>>>>
>>>>>>>> Maybe for map.dat you need to escape such characters somehow,
>>>>>>>> so that
>>>>>>>> they don't appear as literal newlines/carriage returns.
>>>>>>>>
>>>>>>> I am confused.
>>>>>>> currently pg_dumpall plain format will abort when encountering
>>>>>>> dbname
>>>>>>> containing newline.
>>>>>>> the left dumped plain file does not contain all the cluster
>>>>>>> databases data.
>>>>>>>
>>>>>>>
>>>>>>> if pg_dumpall non-text format aborts earlier,
>>>>>>> it's aligned with pg_dumpall plain format?
>>>>>>> it's also an improvement since aborts earlier, nothing will be
>>>>>>> dumped?
>>>>>>>
>>>>>>>
>>>>>>> am i missing something?
>>>>>>>
>>>>>>>
>>>>>> I think we should fix that.
>>>>>>
>>>>>> But for the current proposal, Álvaro and I were talking this
>>>>>> morning,
>>>>>> and we thought the simplest thing here would be to have the one line
>>>>>> format and escape NL/CRs in the database name.
>>>>>>
>>>>>>
>>>>>> cheers
>>>>>>
>>>>> Okay. As per discussions, we will keep one line entry for each
>>>>> database into map.file.
>>>>>
>>>>> Thanks all for feedback and review.
>>>>>
>>>>> Here, I am attaching updated patches for review and testing. These
>>>>> patches can be applied on commit a6524105d20b.
>>>>
>>>>
>>>> I'm working through this patch set with a view to committing it.
>>>> Attached is some cleanup which is where I got to today, although there
>>>> is more to do. One thing I am wondering is why not put the
>>>> SimpleDatabaseOidList stuff in fe_utils/simle_list.{c,h} ? That's
>>>> where all the similar stuff belongs, and it feels strange to have this
>>>> inline in pg_restore.c. (I also don't like the name much -
>>>> SimpleOidStringList or maybe SimpleOidPlusStringList might be better).
>>>>
>>>>
>>>>
>>>
>>> OK, I have done that, so here is the result. The first two are you
>>> original patches. patch 3 adds the new list type to fe-utils, and patch
>>> 4 contains my cleanups and use of the new list type. Apart from some
>>> relatively minor cleanup, the one thing I would like to change is how
>>> dumps are named. If we are producing tar or custom format dumps, I
>>> think
>>> the file names should reflect that (oid.dmp and oid.tar rather than a
>>> bare oid as the filename), and pg_restore should look for those. I'm
>>> going to work on that tomorrow - I don't think it will be terribly
>>> difficult.
>>>
>> Thanks Andrew.
>>
>> Here, I am attaching a delta patch for oid.tar and oid.dmp format.
>>
>
>
> OK, looks good, I have incorporated that.
>
> There are a couple of rough edges, though.
>
> First, I see this:
>
>
> andrew(at)ub22arm:inst $ bin/pg_restore -C -d postgres
> --exclude-database=regression_dummy_seclabel
> --exclude-database=regression_test_extensions
> --exclude-database=regression_test_pg_dump dest
> pg_restore: error: could not execute query: "ERROR: role "andrew"
> already exists
> "
> Command was: "
>
> --
> -- Roles
> --
>
> CREATE ROLE andrew;"
> pg_restore: warning: errors ignored on global.dat file restore: 1
> pg_restore: error: could not execute query: ERROR: database
> "template1" already exists
> Command was: CREATE DATABASE template1 WITH TEMPLATE = template0
> ENCODING = 'SQL_ASCII' LOCALE_PROVIDER = libc LOCALE = 'C';
>
>
> pg_restore: warning: errors ignored on database "template1" restore: 1
> pg_restore: error: could not execute query: ERROR: database
> "postgres" already exists
> Command was: CREATE DATABASE postgres WITH TEMPLATE = template0
> ENCODING = 'SQL_ASCII' LOCALE_PROVIDER = libc LOCALE = 'C';
>
>
> pg_restore: warning: errors ignored on database "postgres" restore: 1
> pg_restore: warning: errors ignored on restore: 3
>
>
>
> It seems pointless to be trying to create the rolw that we are
> connected as, and we also expect template1 and postgres to exist.
>
> In a similar vein, I don't see why we are setting the --create flag in
> pg_dumpall for those databases. I'm attaching a patch that is designed
> to stop that, but it doesn't solve the above issues.
>
> I also notice a bunch of these in globals.dat:
>
>
> --
> -- Databases
> --
>
> --
> -- Database "template1" dump
> --
>
> --
> -- Database "andrew" dump
> --
>
> --
> -- Database "isolation_regression_brin" dump
> --
>
> --
> -- Database "isolation_regression_delay_execution" dump
> --
>
> ...
>
>
> The patch also tries to fix this.
>
> Lastly, this badly needs some TAP tests written.
>
> I'm going to work on reviewing the documentation next.
>
>
>

I have reworked the documentation some. See attached.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment	Content-Type	Size
dumpall-docs.patch.noci	text/plain	10.1 KB

In response to

Re: Non-text mode for pg_dumpall at 2025-03-30 16:50:51 from Andrew Dunstan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alexey Makhmutov	2025-03-30 22:19:29	Re: High CPU consumption in cascade replication with large number of walsenders and ConditionVariable broadcast issues
Previous Message	Tom Lane	2025-03-30 21:58:11	Re: general purpose array_sort