Re: Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Subject: Re: Statistics Import and Export
Date: 2024-05-16 09:25:58
Message-ID: CADkLM=dse9b6VYkgt8MMe3htt=tD1bZOdk3KvH=3Gt17LqSVXA@mail.gmail.com
Lists: pgsql-hackers

>
> Can you explain what you did with the
> SECTION_NONE/SECTION_DATA/SECTION_POST_DATA over v19-v21 and why?
>

Initially, I got things working by having statistics import behave like
COMMENTs: they were run immediately after the table/matview/index/constraint
that created the pg_class/pg_attribute entries, but they could be suppressed
with a --noX flag.

Earlier comments from others suggested that:

- Having them in SECTION_NONE was a grave mistake
- Everything that could belong in SECTION_DATA should, and the rest should
be in SECTION_POST_DATA
- This would almost certainly require the statistics import commands to be
TOC objects (one object per pg_class entry, not one object per function
call)

Turning them into TOC objects was a multi-phase process.

1. The TOC entries are generated with dependencies (the parent pg_class
object, plus the unique/pk constraint, if any, in the case of indexes), but
no statements are generated yet, in case the stats or the parent object are
filtered out. This TOC entry must carry everything we'll need to generate
the function calls later. So far, that information is the parent name,
parent schema, and relkind of the parent object.

2. The TOC entries get sorted by dependencies, and additional dependencies
are added which enforce the PRE/DATA/POST boundaries. This is where knowing
the parent object's relkind is required, as that determines the DATA/POST
section.

3. Now the TOC entry is able to stand on its own, and generate the
statements if they survive the dump/restore filters.

Most of the later versions of the patch were efforts to get the objects to
fall into the right PRE/DATA/POST sections. The central bug was that the
dependencies passed into ARCHIVE_OPTS were incorrect, because the dependent
object passed in was now the new TOC object, not the parent TOC object. Once
that was resolved, things fell into place.
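
To make the three phases concrete, here is a rough sketch in Python. It is
not the pg_dump code; it only models the ordering logic. Every name in it
(StatsEntry, make_stats_entry, section_for, restore_order, generate, the
boundary ids) is made up for illustration. The relkind-to-section mapping
(tables, partitioned tables and foreign tables in DATA, everything else in
POST_DATA) is an assumption, and the generated "statement" is a placeholder
comment rather than a real function call.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Dict, List, Optional, Tuple

# Hypothetical ids for the two section-boundary pseudo-objects.
PRE_BOUNDARY, POST_BOUNDARY = -1, -2

# ASSUMPTION: relkinds whose statistics can be restored in the DATA section.
DATA_RELKINDS = {"r", "p", "f"}


@dataclass
class StatsEntry:
    dump_id: int
    parent_schema: str
    parent_name: str
    relkind: str
    deps: List[int]  # dump ids that must be restored before this entry


def make_stats_entry(dump_id: int, parent_id: int, schema: str, name: str,
                     relkind: str,
                     constraint_id: Optional[int] = None) -> StatsEntry:
    """Phase 1: record dependencies and parent info, but no statements."""
    deps = [parent_id]  # the parent's id, not the new entry's own id
    if constraint_id is not None:
        deps.append(constraint_id)
    return StatsEntry(dump_id, schema, name, relkind, deps)


def section_for(relkind: str) -> str:
    """The parent's relkind decides the section (assumed mapping)."""
    return "DATA" if relkind in DATA_RELKINDS else "POST_DATA"


def restore_order(objects: Dict[int, Tuple[str, List[int]]],
                  stats: List[StatsEntry]) -> List[int]:
    """Phase 2: add boundary dependencies, then sort by dependencies."""
    nodes = dict(objects)
    for e in stats:
        nodes[e.dump_id] = (section_for(e.relkind), e.deps)

    # graph maps each id to the set of ids that must come before it.
    graph = {PRE_BOUNDARY: set(), POST_BOUNDARY: {PRE_BOUNDARY}}
    for dump_id, (section, deps) in nodes.items():
        graph[dump_id] = set(deps)
        if section == "PRE_DATA":
            graph[PRE_BOUNDARY].add(dump_id)
        elif section == "DATA":
            graph[dump_id].add(PRE_BOUNDARY)
            graph[POST_BOUNDARY].add(dump_id)
        else:  # POST_DATA
            graph[dump_id].add(POST_BOUNDARY)
    return list(TopologicalSorter(graph).static_order())


def generate(entry: StatsEntry, dump_stats: bool = True,
             parent_included: bool = True) -> Optional[str]:
    """Phase 3: emit statements only if the entry survives the filters."""
    if not (dump_stats and parent_included):
        return None
    # Placeholder text; the real output would be the import function calls.
    return f"-- statistics for {entry.parent_schema}.{entry.parent_name}"


# A table (id 1), its index (id 2), and one stats entry for each.
objects = {1: ("PRE_DATA", []), 2: ("POST_DATA", [1])}
stats = [make_stats_entry(3, 1, "public", "t", "r"),
         make_stats_entry(4, 2, "public", "t_idx", "i")]
order = restore_order(objects, stats)
```

In this toy example the table's stats entry sorts between the two
boundaries and the index's stats entry sorts after the index itself, which
is the behavior described in steps 2 and 3.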
