From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Euler Taveira <euler(at)eulerto(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: improve performance of pg_dump with many sequences |
Date: | 2024-07-10 22:05:11 |
Message-ID: | Zo8Fl53vBVS1Cnmt@nathan |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jul 10, 2024 at 05:08:56PM -0300, Euler Taveira wrote:
> Nice improvement. The numbers for a realistic scenario (10k sequences) are
Thanks for taking a look!
> You are changing internal representation from char to int64. Is the main goal to
> validate catalog data? What if there is a new sequence data type whose
> representation is not an integer?
IIRC 0001 was primarily intended to reduce the amount of memory needed for
the sorted table. Regarding a new sequence data type, I'm assuming we'll
have much bigger fish to fry if we do that (e.g., pg_sequence uses int8 for
the values), and I'd hope that adjusting this code wouldn't be too
difficult, anyway.
> This code path is adding zero byte to the last position of the fixed string. I
> suggest that the zero byte is added to the position after the string length.
I'm not following why that would be a better approach. strncpy() will add
a NUL to the end of the string unless it doesn't fit in the buffer, in
which case we'll add our own via "seqtype[sizeof(seqtype) - 1] = '\0'".
Furthermore, the compiler can determine the position where the NUL should
be placed, whereas placing it at the end of the copied string requires a
runtime strlen().
> l = strlen(PQgetvalue(res, 0, 0));
> Assert(l < sizeof(seqtype));
> strncpy(seqtype, PQgetvalue(res, 0, 0), l);
> seqtype[l] = '\0';
I think the strncpy() should really be limited to the size of the seqtype
buffer. IMHO an Assert is not sufficient.
> If you are not planning to apply 0003, make sure you fix collectSequences() to
> avoid versions less than 10. Move this part to 0002.
Yeah, no need to create the table if we aren't going to use it.
> Since you apply a fix for pg_sequence_last_value function, you can simplify the
> query in 0003. CASE is not required.
Unfortunately, I think we have to keep this workaround since older minor
releases of PostgreSQL don't have the fix.
> patched (0001 and 0002):
> real 0m0,290s
> user 0m0,038s
> sys 0m0,104s
>
> I'm not sure if 0003 is worth. Maybe if you have another table like you
> suggested.
What pg_dump command did you test here? Did you dump the sequence data, or
was this --schema-only?
--
nathan
From | Date | Subject | |
---|---|---|---|
Next Message | Alena Rybakina | 2024-07-10 23:30:22 | Re: POC, WIP: OR-clause support for indexes |
Previous Message | Nathan Bossart | 2024-07-10 21:41:01 | Re: improve predefined roles documentation |