From: | Ron Johnson <ronljohnsonjr(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Experience and feedback on pg_restore --data-only |
Date: | 2025-03-24 16:00:15 |
Message-ID: | CANzqJaAXrDjPqLUt7VN2KOUeQBdoWsy9mfghSr5a-0kuf_pQUg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Why are you regularly having emergencies requiring the restoration of
multi-TB tables to databases with lots of cruft?
Fixing that would go a long way towards eliminating your problems with
pg_restore.
On Mon, Mar 24, 2025 at 11:51 AM Dimitrios Apostolou <jimis(at)gmx(dot)net> wrote:
> On Mon, 24 Mar 2025, Adrian Klaver wrote:
>
> > On 3/24/25 07:24, Dimitrios Apostolou wrote:
> >> On Sun, 23 Mar 2025, Laurenz Albe wrote:
> >>
> >>> On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou wrote:
> >>>> Performance issues: (important as my db size is >5TB)
> >>>>
> >>>> * WAL writes: I didn't manage to avoid writing to the WAL, despite
> >>>> having
> >>>> setting wal_level=minimal. I even wrote my own function to ALTER
> all
> >>>> tables to UNLOGGED, but failed with "could not change table T to
> >>>> unlogged because it references logged table". I'm out of ideas on
> >>>> this
> >>>> one.
> >>>
> >>> You'd have to create an load the table in the same transaction, that
> is,
> >>> you'd have to run pg_restore with --single-transaction.
> >>
> >> That would restore the schema from the dump, while I want to create the
> >> schema from the SQL code in version control.
> >
> >
> > I am not following, from your original post:
> >
> > "
> > ... create a
> > clean database by running the SQL schema definition from version
> control, and
> > then copy the data for only the tables created.
> >
> > For this case, I choose to run pg_restore --data-only, and run it as the
> user
> > who owns the database (dbowner), not as a superuser, in order to avoid
> > changes being introduced under the radar.
> > "
> >
> > You are running the process in two steps, where the first does not
> involve
> > pg_restore. Not sure why doing the pg_restore --data-only portion in
> single
> > transaction is not possible?
>
> Laurenz informed me that I could avoid writing to the WAL if I "create and
> load the table in a single transaction".
> I haven't tried, but here is what I would do to try --single-transaction:
>
> Transaction 1: manually issuing all of CREATE TABLE etc.
>
> Transaction 2: pg_restore --single-transaction --data-only
>
> The COPY command in transaction 2 would still need to write to WAL, since
> it's separate from the CREATE TABLE.
>
> Am I wrong somewhere?
>
> >> Something that might work, would be for pg_restore to issue a TRUNCATE
> >> before the COPY. I believe this would require superuser privelege
> though,
> >> that I would prefer to avoid. Currently I issue TRUNCATE for all tables
> >> manually before running pg_restore, but of course this is in a
> different
> >> transaction so it doesn't help.
> >>
> >> By the way do you see potential problems with using
> --single-transaction
> >> to restore billion-rows tables?
> >
> > COPY is all or none(version 17+ caveat(see
> > https://www.postgresql.org/docs/current/sql-copy.html ON_ERROR)), so
> if the
> > data dump fails in --single-transaction everything rolls back.
>
> So if I restore all tables, then an error about a "table not found" would
> not roll back already copied tables, since it's not part of a COPY?
>
>
> Thank you for the feedback,
> Dimitris
>
>
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
From | Date | Subject | |
---|---|---|---|
Next Message | Dimitrios Apostolou | 2025-03-24 16:05:52 | Re: Experience and feedback on pg_restore --data-only |
Previous Message | Dimitrios Apostolou | 2025-03-24 15:51:30 | Re: Experience and feedback on pg_restore --data-only |