Re: request clarification on pg_restore documentation

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: jcarson(at)fastmail(dot)com
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: request clarification on pg_restore documentation
Date: 2023-10-27 22:33:38
Message-ID: ZTw6wv8rkkC8IMk0@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers


Does anyone have a suggestion on how to handle this issue? The report
is a year old.

---------------------------------------------------------------------------

On Thu, Oct 6, 2022 at 10:28:15AM -0400, Bruce Momjian wrote:
> On Thu, Sep 29, 2022 at 02:30:13PM +0000, PG Doc comments form wrote:
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/14/app-pgrestore.html
> > Description:
> >
> > pg_restore seems to have two ways to restore data:
> >
> > --section=data
> >
> > or
> >
> > --data-only
> >
> > There is this cryptic warning that --data-only is "similar to but for
> > historical reasons different from" --section=data
> >
> > But there is no further explanation of what those differences are or what
> > might be missed or different in your restore if you pick one option or the
> > other. Maybe one or the other option is the "preferred current way" and one
> > is the "historical way" or they are aimed at different types of use cases,
> > but that's not clear.
>
> [Thread moved from docs to hackers because there are behavioral issues.]
>
> Very good question. I dug into this and found this commit which says
> --data-only and --section=data were equivalent:
>
> commit a4cd6abcc9
> Author: Andrew Dunstan <andrew(at)dunslane(dot)net>
> Date: Fri Dec 16 19:09:38 2011 -0500
>
> Add --section option to pg_dump and pg_restore.
>
> Valid values are --pre-data, data and post-data. The option can be
> given more than once. --schema-only is equivalent to
> --> --section=pre-data --section=post-data. --data-only is equivalent
> --> to --section=data.
>
> and then this commit which says they are not:
>
> commit 4317e0246c
> Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Date: Tue May 29 23:22:14 2012 -0400
>
> Rewrite --section option to decouple it from --schema-only/--data-only.
>
> --> The initial implementation of pg_dump's --section option supposed that the
> --> existing --schema-only and --data-only options could be made equivalent to
> --> --section settings. This is wrong, though, due to dubious but long since
> set-in-stone decisions about where to dump SEQUENCE SET items, as seen in
> bug report from Martin Pitt. (And I'm not totally convinced there weren't
> other bugs, either.) Undo that coupling and instead drive --section
> filtering off current-section state tracked as we scan through the TOC
> list to call _tocEntryRequired().
>
> To make sure those decisions don't shift around and hopefully save a few
> cycles, run _tocEntryRequired() only once per TOC entry and save the result
> in a new TOC field. This required minor rejiggering of ACL handling but
> also allows a far cleaner implementation of inhibit_data_for_failed_table.
>
> Also, to ensure that pg_dump and pg_restore have the same behavior with
> respect to the --section switches, add _tocEntryRequired() filtering to
> WriteToc() and WriteDataChunks(), rather than trying to implement section
> filtering in an entirely orthogonal way in dumpDumpableObject(). This
> required adjusting the handling of the special ENCODING and STDSTRINGS
> items, but they were pretty weird before anyway.
>
> and this commit which made them closer:
>
> commit 5a39114fe7
> Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Date: Fri Oct 26 12:12:42 2012 -0400
>
> In pg_dump, dump SEQUENCE SET items in the data not pre-data section.
>
> Represent a sequence's current value as a separate TableDataInfo dumpable
> object, so that it can be dumped within the data section of the archive
> --> rather than in pre-data. This fixes an undesirable inconsistency between
> --> the meanings of "--data-only" and "--section=data", and also fixes dumping
> of sequences that are marked as extension configuration tables, as per a
> report from Marko Kreen back in July. The main cost is that we do one more
> SQL query per sequence, but that's probably not very meaningful in most
> databases.
>
> Looking at the restore code, I see --data-only disabling triggers, while
> --section=data doesn't. I also tested --data-only vs. --section=data in
> pg_dump for the regression database and saw the only differences as the
> creation and comments on large objects, e.g.,
>
> -- Name: 2121; Type: BLOB; Schema: -; Owner: postgres
> --
>
> SELECT pg_catalog.lo_create('2121');
>
>
> ALTER LARGE OBJECT 2121 OWNER TO postgres;
>
> --
> -- Name: LARGE OBJECT 2121; Type: COMMENT; Schema: -; Owner: postgres
> --
>
> COMMENT ON LARGE OBJECT 2121 IS 'testing comments';
>
> but the large object _data_ was dumped in both cases.
>
> So, where does this leave us? We know we need --section=data because
> the pre/post-data options are clearly useful, so why would someone use
> --data-only vs. --section=data. We don't document why to use one rather
> than the other, so the --data-only option looks useless to me. Do we
> remove it, adjust it, or leave it alone?
>
> --
> Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
> EDB https://enterprisedb.com
>
> Indecision is a decision. Inaction is an action. Mark Batterson
>
>
>

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Tom Lane 2023-10-27 22:53:00 Re: request clarification on pg_restore documentation
Previous Message Bruce Momjian 2023-10-27 21:23:45 Re: Minor inconsistencies

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-10-27 22:40:55 Re: race condition in pg_class
Previous Message Steven Rostedt 2023-10-27 21:52:16 Re: [POC][RFC][PATCH v2] sched: Extended Scheduler Time Slice