Re: improve performance of pg_dump with many sequences

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Nathan Bossart" <nathandbossart(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve performance of pg_dump with many sequences
Date: 2024-07-10 20:08:56
Message-ID: 0a0f4d52-ab75-4d69-abe8-5713bf70eee6@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 9, 2024, at 4:11 PM, Nathan Bossart wrote:
> rebased

Nice improvement. The numbers for a realistic scenario (10k sequences) are

for i in `seq 1 10000`; do echo "CREATE SEQUENCE s$i;"; done > /tmp/s.sql

master:
real 0m1,141s
user 0m0,056s
sys 0m0,147s

patched:
real 0m0,410s
user 0m0,045s
sys 0m0,103s

You are changing internal representation from char to int64. Is the main goal to
validate catalog data? What if there is a new sequence data type whose
representation is not an integer?

This code path is adding zero byte to the last position of the fixed string. I
suggest that the zero byte is added to the position after the string length.

Assert(strlen(PQgetvalue(res, 0, 0)) < sizeof(seqtype));
strncpy(seqtype, PQgetvalue(res, 0, 0), sizeof(seqtype));
seqtype[sizeof(seqtype) - 1] = '\0';

Something like

l = strlen(PQgetvalue(res, 0, 0));
Assert(l < sizeof(seqtype));
strncpy(seqtype, PQgetvalue(res, 0, 0), l);
seqtype[l] = '\0';

Another suggestion is to use a constant for seqtype

char seqtype[MAX_SEQNAME_LEN];

and simplify the expression:

size_t seqtype_sz = sizeof(((SequenceItem *) 0)->seqtype);

If you are not planning to apply 0003, make sure you fix collectSequences() to
avoid versions less than 10. Move this part to 0002.

@@ -17233,11 +17235,24 @@ collectSequences(Archive *fout)
PGresult *res;
const char *query;

+ if (fout->remoteVersion < 100000)
+ return;
+

Since you apply a fix for pg_sequence_last_value function, you can simplify the
query in 0003. CASE is not required.

I repeated the same test but not applying 0003.

patched (0001 and 0002):
real 0m0,290s
user 0m0,038s
sys 0m0,104s

I'm not sure if 0003 is worth. Maybe if you have another table like you
suggested.

--
Euler Taveira
EDB https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-07-10 21:01:34 Re: Assertion failure with summarize_wal enabled during pg_createsubscriber
Previous Message Nathan Bossart 2024-07-10 19:32:06 Re: pg_maintain and USAGE privilege on schema