Re: Enhance pg_dump multi-threaded streaming (WAS: Re: filesystem full during vacuum - space recovery issues)

From: Thomas Simpson <ts(at)talentstack(dot)to>
To: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Enhance pg_dump multi-threaded streaming (WAS: Re: filesystem full during vacuum - space recovery issues)
Date: 2024-07-19 13:46:14
Message-ID: 6107c011-f907-48ba-94e4-884f59c3da1a@talentstack.to
Lists: pgsql-admin pgsql-hackers

Hi Scott,

I realize some of the background was snipped from what I sent to the
hackers list, so I'll try to fill in the details.

The short background is that a very large database ran out of space
during a VACUUM FULL, taking down the server.  There is a replica which
was applying the WAL, so it too ran out of space.  On restart, after
clearing some space, the database came back up but left behind the
in-progress rebuild files.  I've since cleared that replica and am
using it as my rebuild target for now.

Every attempt to identify the 'orphan' files and move them aside ended
with the database noticing the supposedly unused files were missing and
refusing to start, so I had no reliable way to clean up and reclaim the
space.
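
The kind of check involved is, roughly, comparing the files the
catalogs expect against what is actually on disk.  A hypothetical
sketch only (database name and output file are placeholders):

    # list the relation files the catalogs know about for one database
    psql -d mydb -Atc "SELECT pg_relation_filepath(oid::regclass) FROM pg_class" \
      | grep -v '^$' | sort > expected_files.txt

    # anything under that database's directory in $PGDATA that is not in
    # the list (allowing for _fsm/_vm/_init forks and .1, .2, ... segment
    # files) is a candidate orphan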

The last resort, after discussion, is pg_dumpall and reload.  I'm doing
this via a network pipe (netcat) because I do not have the vast amount
of storage needed to hold the dump file (in any format).
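
In rough terms the pipe looks something like this (host name and port
are placeholders, and the exact listen syntax depends on the netcat
variant):

    # destination: listen and feed the stream straight into psql
    nc -l 9000 | psql -d postgres

    # source: dump everything and send it over the wire
    pg_dumpall | nc dest-host 9000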

On 19-Jul-2024 09:26, Scott Ribe wrote:
> Do you actually have 100G networking between the nodes? Because if not, a single CPU should be able to saturate 10G.
The servers connect via a 10G WAN; sending is not the issue, it's the
application of the incoming stream on the destination that is
bottlenecked.
>
> Likewise the receiving end would need disk capable of keeping up. Which brings up the question, why not write to disk, but directly to the destination rather than write locally then copy?
In this case, it's not a local write; it's piped via netcat.
> Do you require dump-reload because of suspected corruption? That's a tough one. But if not, if the goal is just to get up and running on a new server, why not pg_basebackup, streaming replica, promote? That depends on the level of data modification activity being low enough that pg_basebackup can keep up with WAL as it's generated and apply it faster than new WAL comes in, but given that your server is currently keeping up with writing that much WAL and flushing that many changes, seems likely it would keep up as long as the network connection is fast enough. Anyway, in that scenario, you don't need to care how long pg_basebackup takes.
>
> If you do need a dump/reload because of suspected corruption, the only thing I can think of is something like doing it a table at a time--partitioning would help here, if practical.

pg_basebackup is, to the best of my understanding, essentially just
copying the database files.  Since the failed vacuum has left extra
files behind, my expectation is that these too would be copied, leaving
me in the same position I started in.  If I'm wrong, please tell me, as
that would be vastly quicker - it is how I originally set up the
replica, and it took only a few hours over the 10G link.
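
For reference, the replica was originally built with pg_basebackup
along roughly these lines (host and target path are placeholders):

    pg_basebackup -h source-host -D /path/to/new/data -X stream -P -R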

The inability to get a clean start if I move any files out of the way
leaves me concerned about some underlying corruption or issue, and the
recommendation earlier in the discussion was to opt for dump/reload as
the fail-safe.

Resigned to my fate, my thought was to see whether there is a way to
improve the dump-reload approach for the future.  Since dump-reload is
the ultimate upgrade suggestion in the documentation, it seems
worthwhile to see whether its performance can be improved, especially
as very large databases like mine do exist with PostgreSQL.  From a
quick review of pg_dump.c (I'm no expert on it, obviously), it feels
like it already does most of what is needed; the missing piece is some
sort of multi-threaded coordination with a restore client so that each
worker completes the task it has before accepting more work.  I realize
that's actually difficult to implement.
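
As a crude approximation of what I have in mind, built only from
existing tools, one could restore the schema first and then stream
table data through several concurrent pipes.  A hypothetical sketch
only (host, database name and degree of parallelism are placeholders;
it ignores large objects, dependency ordering and error handling):

    # one-off: recreate globals and the schema on the destination
    pg_dumpall --globals-only | psql -h dest-host -d postgres
    pg_dump --schema-only -d mydb | psql -h dest-host -d mydb

    # then stream table data over several pipes at once
    psql -d mydb -Atc "SELECT format('%I.%I', schemaname, tablename)
                         FROM pg_tables
                        WHERE schemaname NOT IN ('pg_catalog', 'information_schema')" |
      xargs -P 4 -I {} sh -c \
        'pg_dump --data-only --table="{}" -d mydb | psql -h dest-host -d mydb'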

Thanks

Tom
