Re: Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org
Subject: Re: Statistics Import and Export
Date: 2025-01-16 15:52:25
Message-ID: CADkLM=dKGiH1Qkd2h_D36dDg0U5=4X7CG8odp3knyANACcNZMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> pg_dump --dbname=src2 --table=tenk1 --statistics-only --verbose > x.sql
> there no pg_restore_attribute_stats, pg_restore_relation_stats there
> for table tenk1.
> There aren't any statistics, should there be?
>

pg_restore_relation_stats: yes. looking into that.
pg_restore_attribute_stats: yes IF the table had been analyzed, no
otherwise.

> also should
> pg_dump --dbname=src2 --no-data --no-schema --table=tenk1
> dump the statistics? currently seems no.
>

> in doc/src/sgml/ref/pg_dump.sgml
> there are six options to control the main output now.
> --schema-only, --statistics-only, --data-only,
> --no-schema, --no-data, --no-statistics
> maybe we need spare one paragraph to explain the default behavior,
> and also have an example on it?
>

+1

> pg_dump --dbname=src2 --table=tenk1 --verbose > 1.sql
> seems the statistics dump (pg_restore_attribute_stats) is ordered by
> attribute name.
> should it make more sense to order by attnum?
>

For security reasons, we pull attribute statistics from pg_stats (
https://www.postgresql.org/docs/current/view-pg-stats.html) which does not
have attnum. The only reason we order the list is to ensure that dump
comparison tests match.

> getRelationStatistics
> typedef struct _relStatsInfo
> {
> DumpableObject dobj;
> char relkind; /* 'r', 'v', 'c', etc */
> bool postponed_def;
> } RelStatsInfo;
> comment /* 'r', 'v', 'c', etc */
> Is it wrong? there is no relkind='c'.
>

Yeah, I think 'c' should be an 'm' there.

> field postponed_def really deserves a comment.
>

Can do.

>
>
> we also need change
> enum dbObjectTypePriorities
> static const int dbObjectTypePriority[]
> ?
>

We'd need two entries, because that enum includes PRIO_PRE_DATA_BOUNDARY
and PRIO_POST_DATA_BOUNDARY, and statistics can either be in DATA or
POST_DATA.

>
>
> in dumpRelationStats, we can add Assert on it.
> if (!fout->dopt->dumpStatistics)
> return;
> Assert(dobj->dump & DUMP_COMPONENT_STATISTICS);
>
>
> I found out the owner's info is missing in the dumped content.
> for example, the line "Name: STATISTICS DATA tenk1_pkey;" missing owner
> info.
> not sure this is intended?

Good question. I'm not sure if we need it or not. If stats had an owner,
it'd be the owner of the relation.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2025-01-16 15:59:31 Re: per backend WAL statistics
Previous Message Álvaro Herrera 2025-01-16 15:48:04 Re: Allow NOT VALID foreign key constraints on partitioned tables.