Support --include-analyze in pg_dump, pg_dumpall, pg_restore

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Support --include-analyze in pg_dump, pg_dumpall, pg_restore
Date: 2025-01-14 12:54:02
Message-ID: CACJufxGx_YLLxmnZVMuUApiVYqjKrydvEQqD3qxhwM6e0_8jrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

hi.
after watching https://www.youtube.com/live/k4A9-WZET_4?si=vz3lTud735s2vcCO
then trying to hack it too.

the implemented feature is as the $subject description, making pg_dump
also includes the ANALYZE command in its output.

if option --include-analyze not specified, then pg_dump will not dump
ANALYZE, that's the status quo.
option --include-analyze support table relkind: RELKIND_PARTITIONED_TABLE,
RELKIND_MATVIEW, RELKIND_RELATION.

this option cannot be used with option --schema-only.
for materialized view, it will dump after the refresh materialized
view command.
(tested in several cases, ANALYZE will really be at the end of the dump).

By default, pg_restore does not restore the ANALYZE commands.
This means that if the archive contains ANALYZE commands,
you still need to explicitly specify the --include-analyze option to
restore them.

doc added.
demo for dump:
create materialized view mvw as select a from generate_series(1, 3) a;

---<<<<<<<<<<<<<<portion of pg_dump output starts>>>>>>>>>>>>>>>>>>>>>>
--
-- Name: mvw; Type: MATERIALIZED VIEW DATA; Schema: public; Owner: jian
--
REFRESH MATERIALIZED VIEW public.mvw;

--
-- Name: mvw; Type: ANALYZE; Schema: public; Owner: jian
--

ANALYZE public.mvw;

---<<<<<<<<<<<<<<portion of pg_dump output ends>>>>>>>>>>>>>>>>>>>>>>
demo for dump partitioned table:
CREATE TABLE prt1 (a int, b int, c text) PARTITION BY RANGE(a) ;
CREATE TABLE prt1_p1 PARTITION OF prt1 FOR VALUES FROM (0) TO (5);
CREATE TABLE prt1_p2 PARTITION OF prt1 FOR VALUES FROM (8) TO (11);
INSERT INTO prt1 SELECT i, i+11, to_char(i, 'FM0000') FROM
generate_series(0, 2) i;

pg_dump --table=*prt1* --include-analyze
---<<<<<<<<<<<<<<portion of output ends>>>>>>>>>>>>>>>>>>>>>>
--
-- Data for Name: prt1_p1; Type: TABLE DATA; Schema: public; Owner: jian
--
COPY public.prt1_p1 (a, b, c) FROM stdin;
0 11 0000
1 12 0001
2 13 0002
\.

--
-- Data for Name: prt1_p2; Type: TABLE DATA; Schema: public; Owner: jian
--
COPY public.prt1_p2 (a, b, c) FROM stdin;
\.
--
-- Name: prt1; Type: ANALYZE; Schema: public; Owner: jian
--
ANALYZE ONLY public.prt1;
--
-- Name: prt1_p1; Type: ANALYZE; Schema: public; Owner: jian
--
ANALYZE public.prt1_p1;
--
-- Name: prt1_p2; Type: ANALYZE; Schema: public; Owner: jian
--

ANALYZE public.prt1_p2;

---<<<<<<<<<<<<<<portion of output starts>>>>>>>>>>>>>>>>>>>>>>
TODO item: writing tests.

idea credits to Andrey Borodin, Nikolay Samokhvalov, Kirk Wolak

what do you think?

Attachment Content-Type Size
v1-0001-Support-include-analyze-in-pg_dump-pg_dumpall-pg_.patch text/x-patch 22.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2025-01-14 13:06:31 Re: Reorder shutdown sequence, to flush pgstats later
Previous Message Hayato Kuroda (Fujitsu) 2025-01-14 12:12:50 RE: Replace current implementations in crypt() and gen_salt() to OpenSSL