From: | hubert depesz lubaczewski <depesz(at)depesz(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Can we get rid of repeated queries from pg_dump? |
Date: | 2021-08-26 16:06:44 |
Message-ID: | 20210826160644.GA3120@depesz.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
On Thu, Aug 26, 2021 at 10:20:29AM -0400, Tom Lane wrote:
> Well, you could move it forward by doing the legwork to identify which
> queries are worth merging. Is it really sane to do a global "select
> format_type() from pg_type" query and save all the results on the client
> side? I wonder whether there are cases where that'd be a net loss.
> You could do the experimentation to figure that out without necessarily
> having the C skills to make pg_dump actually do it.
So, I got some info.
First, some stats. The DB contains:
- 14 extensions
- 1 aggregate
- 107 functions
- 5 schemas
- 5 sequences
- 188 logged tables
- 1 unlogged table
- 206 "normal" indexes
- 30 unique indexes
- 15 materialized views
- 16 triggers
- 87 types
- 26 views
pg_dump -s of it is ~ 670kB.
Interestingly, while dumping (pg_dump -s -v), we can see progress going on, and then, after:
====
...
pg_dump: reading publications
pg_dump: reading publication membership
pg_dump: reading subscriptions
pg_dump: reading dependency data
pg_dump: saving encoding = UTF8
pg_dump: saving standard_conforming_strings = on
pg_dump: saving search_path =
====
It stops (progress visible in console). And then, in pg logs I see queries like:
#v+
SELECT
proretset,
prosrc,
probin,
provolatile,
proisstrict,
prosecdef,
lanname,
proconfig,
procost,
prorows,
pg_catalog.pg_get_function_arguments(p.oid) AS funcargs,
pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs,
pg_catalog.pg_get_function_result(p.oid) AS funcresult,
proleakproof,
array_to_string(protrftypes, ' ') AS protrftypes,
proparallel,
prokind,
prosupport,
NULL AS prosqlbody
FROM pg_catalog.pg_proc p, pg_catalog.pg_language l
WHERE p.oid = '43875'::pg_catalog.oid AND l.oid = p.prolang
#v-
Now for query stats.
To dump it all, pg_dump needed 9173 queries (logged by
log_min_duration_statement = 0 for this user).
I extracted all queries to separate files, and made stats. In total there were
only 4257 unique queries.
Then I checked for repeated queries. Top 10 most repeated offenders were:
615 times : SELECT pg_catalog.format_type('25'::pg_catalog.oid, NULL)
599 times : SELECT pg_catalog.format_type('23'::pg_catalog.oid, NULL)
579 times : SELECT pg_catalog.format_type('2281'::pg_catalog.oid, NULL)
578 times : SELECT pg_catalog.format_type('41946'::pg_catalog.oid, NULL)
523 times : SELECT pg_catalog.format_type('701'::pg_catalog.oid, NULL)
459 times : SELECT pg_catalog.format_type('42923'::pg_catalog.oid, NULL)
258 times : SELECT pg_catalog.format_type('16'::pg_catalog.oid, NULL)
176 times : SELECT pg_catalog.format_type('19'::pg_catalog.oid, NULL)
110 times : SELECT pg_catalog.format_type('21'::pg_catalog.oid, NULL)
106 times : SELECT pg_catalog.format_type('42604'::pg_catalog.oid, NULL)
In total, there were 5000 queries:
SELECT pg_catalog.format_type('[0-9]+'::pg_catalog.oid, NULL)
But there were only 83 separate oids that were scanned.
The only other repeated command was:
SELECT pg_catalog.set_config('search_path', '', false);
and it was called only twice.
Based on my reading of queries in order it seems to follow the pattern of:
One call for:
SELECT proretset, prosrc, probin, provolatile, proisstrict, prosecdef, lanname, proconfig, procost, prorows, pg_catalog.pg_get_function_arguments(p.oid) AS funcargs, pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs, pg_catalog.pg_get_function_re
sult(p.oid) AS funcresult, proleakproof, array_to_string(protrftypes, ' ') AS protrftypes, proparallel, prokind, prosupport, NULL AS prosqlbody FROM pg_catalog.pg_proc p, pg_catalog.pg_language l WHERE p.oid = 'SOME_NUMBER'::pg_catalog.oid AND l.oid = p.prolang
and then one or more:
SELECT pg_catalog.format_type('SOME_NUMBER'::pg_catalog.oid, NULL)
In one case, after proc query, there were 94 concecutive
pg_catalog.format_type queries.
I hope it helps.
Best regards,
depesz
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2021-08-26 19:43:10 | Re: lower() and unaccent() not leakproof |
Previous Message | Daniel Gustafsson | 2021-08-26 15:46:14 | Re: lower() and unaccent() not leakproof |
From | Date | Subject | |
---|---|---|---|
Next Message | Vladimir Sitnikov | 2021-08-26 16:08:54 | Re: speed up verifying UTF-8 |
Previous Message | Robert Haas | 2021-08-26 15:52:00 | Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce) |