Re: How to deal with dangling files after aborted `pg_restore`?

From: Saul Perdomo <saul(dot)perdomo(at)gmail(dot)com>
To: Ivan Kurnosov <zerkms(at)zerkms(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: How to deal with dangling files after aborted `pg_restore`?
Date: 2024-12-19 00:02:30
Message-ID: CAN3jBgE6nWgkXRZWJHyGA=p6M0-W1vk-h6_WULnQXEwEEn8pVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Genuine question: Why are you looking to recover from this half-cooked
state instead of restarting the restore process from the beginning?

On Tue, Dec 17, 2024, 1:10 a.m. Ivan Kurnosov <zerkms(at)zerkms(dot)com> wrote:

> The scenario:
>
> 1. There is a postgresql 17 server running
> 2. Restore dump with `--single-transaction` flag
> 3. For whatever reason the server goes away (eg: we kill the process)
> 4. Now `base` directory is filled with abandoned table files which
> postgresql know nothing about
>
> Playground:
>
> Terminal 1:
> Start server: `docker run --rm -it -e POSTGRES_HOST_AUTH_METHOD=trust -v
> $PWD/postgres:/var/lib/postgresql/data postgres:17.2`
>
> Terminal 2:
> 1. Start container with recent pg_restore: `docker run --rm -it -v
> $PWD:/app -w /app postgres:17.2 bash`
> 2. Start dump recovery: `pg_restore -h 172.17.0.2 -U postgres -d postgres
> --single-transaction -v -Fc --no-owner dump.sql`
>
> Terminal 3:
> 1. Find what container is the server: `docker ps`
> 2. Kill it: `docker kill d7ecf6e66c1d`
>
> Terminal 1:
> Start the server again, with the same command
>
> Terminal 3:
> Check there are abandoned large files:
> ```
> # ls -la /home/ivan/postgres/base/5
> <truncated>
> -rw------- 1 systemd-coredump systemd-coredump 342884352 Dec 17 18:58 16399
> -rw------- 1 systemd-coredump systemd-coredump 11149312 Dec 17 18:58 16404
> -rw------- 1 systemd-coredump systemd-coredump 188416 Dec 17 18:58
> 16403_fsm
> -rw------- 1 systemd-coredump systemd-coredump 686145536 Dec 17 18:58 16403
> ```
>
> Terminal 2:
> 1. Confirm those OIDs are not accounted:
> ```
> psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16404
> psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16403
> psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16399
> ```
>
> Hence a question: am I doing something wrong? Is it expected behaviour? If
> so - how would one recover from this scenario now WITHOUT dropping entire
> database?
>
> --
> With best regards, Ivan Kurnosov
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Garcia Badaracco 2024-12-19 03:47:15 Wrapping a where clause to preserve rows with nulls
Previous Message Saul Perdomo 2024-12-19 00:00:14 Re: Query about pg_wal directory filled up