From: | Dimitrios Apostolou <jimis(at)gmx(dot)net> |
---|---|
To: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
Cc: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Experience and feedback on pg_restore --data-only |
Date: | 2025-03-24 15:51:30 |
Message-ID: | 7e990eae-e55c-0d04-1be8-f49bb3251073@gmx.net |
Lists: | pgsql-general |
On Mon, 24 Mar 2025, Adrian Klaver wrote:
> On 3/24/25 07:24, Dimitrios Apostolou wrote:
>> On Sun, 23 Mar 2025, Laurenz Albe wrote:
>>
>>> On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou wrote:
>>>> Performance issues: (important as my db size is >5TB)
>>>>
>>>> * WAL writes: I didn't manage to avoid writing to the WAL, despite
>>>> having set wal_level=minimal. I even wrote my own function to ALTER
>>>> all tables to UNLOGGED, but failed with "could not change table T to
>>>> unlogged because it references logged table". I'm out of ideas on
>>>> this one.
>>>
>>> You'd have to create and load the table in the same transaction, that is,
>>> you'd have to run pg_restore with --single-transaction.
>>
>> That would restore the schema from the dump, while I want to create the
>> schema from the SQL code in version control.
>
>
> I am not following, from your original post:
>
> "
> ... create a
> clean database by running the SQL schema definition from version control, and
> then copy the data for only the tables created.
>
> For this case, I choose to run pg_restore --data-only, and run it as the user
> who owns the database (dbowner), not as a superuser, in order to avoid
> changes being introduced under the radar.
> "
>
> You are running the process in two steps, where the first does not involve
> pg_restore. Not sure why doing the pg_restore --data-only portion in single
> transaction is not possible?

Laurenz informed me that I could avoid writing to the WAL if I "create and
load the table in a single transaction".

I haven't tried it, but here is what I would do to try --single-transaction:

Transaction 1: manually issue all of the CREATE TABLE etc. statements
Transaction 2: pg_restore --single-transaction --data-only

The COPY command in transaction 2 would still need to write to the WAL,
since it runs in a separate transaction from the CREATE TABLE.

Am I wrong somewhere?

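To make the two layouts concrete, here is a minimal sketch written as a small Python helper that only builds the psql script text; it does not connect to a server and I have not run the output against one. The function name build_script, the table "demo" and its columns are hypothetical. My understanding, based on the PostgreSQL documentation for wal_level=minimal, is that only the layout where CREATE TABLE (or TRUNCATE) and COPY share one transaction is eligible to skip the WAL.

```python
def build_script(ddl: str, table: str, rows: list[str], same_transaction: bool) -> str:
    """Return psql script text that creates `table` and loads `rows` via COPY.

    `rows` must already be in COPY text format (tab-separated columns),
    one string per row. All names here are hypothetical illustrations.
    """
    # COPY ... FROM STDIN reads rows until a line containing only "\."
    copy_block = [f"COPY {table} FROM STDIN;", *rows, "\\."]
    if same_transaction:
        # CREATE TABLE and COPY share one transaction (the layout Laurenz described).
        parts = ["BEGIN;", ddl, *copy_block, "COMMIT;"]
    else:
        # Two transactions: the table already exists and is committed
        # by the time COPY starts (the layout I described above).
        parts = ["BEGIN;", ddl, "COMMIT;", "BEGIN;", *copy_block, "COMMIT;"]
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    ddl = "CREATE TABLE demo (id integer, name text);"  # hypothetical table
    print(build_script(ddl, "demo", ["1\talice", "2\tbob"], same_transaction=False))
```

The second layout is what "schema from version control, then pg_restore --data-only" amounts to, which is why I expect the COPY to be WAL-logged there.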
>> Something that might work, would be for pg_restore to issue a TRUNCATE
>> before the COPY. I believe this would require superuser privilege though,
>> that I would prefer to avoid. Currently I issue TRUNCATE for all tables
>> manually before running pg_restore, but of course this is in a different
>> transaction so it doesn't help.
>>
>> By the way do you see potential problems with using --single-transaction
>> to restore billion-rows tables?
>
> COPY is all or none (version 17+ caveat: see ON_ERROR in
> https://www.postgresql.org/docs/current/sql-copy.html), so if the
> data dump fails in --single-transaction everything rolls back.

So if I restore all tables, then an error about a "table not found" would
not roll back already copied tables, since it's not part of a COPY?

Thank you for the feedback,
Dimitris