From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com> |
Subject: | Reducing output size of nodeToString |
Date: | 2023-12-06 21:08:38 |
Message-ID: | CAEze2WgrCiR3JZmWyB0YTc8HV7ewRdx13j0CqD6mVkYAW+SFGQ@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
PFA a patch that reduces the output size of nodeToString by 50%+ in
most cases (measured on pg_rewrite), which on my system reduces the
total size of pg_rewrite by 33% to 472KiB. This does keep the textual
pg_node_tree format alive, but reduces its size significantly.
The basic techniques used are:
- Don't emit scalar fields when they contain a default value, and
make the reading code aware of this.
- Reasonable defaults are set for most datatypes, and overrides can
be added with new pg_node_attr() attributes. No introspection into
non-null Node/Array/etc. is being done though.
- Reset more fields to their default values before storing the values.
- Don't write trailing 0s in outDatum calls for by-ref types. This
saves many bytes for Name fields, but also some other pre-existing
entry points.
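To make the first and last techniques above concrete, here is a minimal standalone sketch. It is not code from the patch: ToyNode, Buf, buf_appendf, toy_out, toy_read and toy_out_bytes are all hypothetical names, and the text layout is only loosely modeled on the pg_node_tree style. The writer skips scalar fields that hold their default, the reader starts from those same defaults and overrides only what it finds, and the byte writer records the full length but stops after the last non-zero byte.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size output buffer; real code would use a growable one. */
typedef struct Buf { char data[256]; size_t len; } Buf;

/* Append formatted text, silently truncating if the buffer is full. */
void
buf_appendf(Buf *b, const char *fmt, ...)
{
    va_list ap;
    int     n;

    va_start(ap, fmt);
    n = vsnprintf(b->data + b->len, sizeof(b->data) - b->len, fmt, ap);
    va_end(ap);
    if (n > 0)
    {
        size_t room = sizeof(b->data) - b->len - 1;

        b->len += ((size_t) n < room) ? (size_t) n : room;
    }
}

/* Hypothetical toy node; assumed defaults: location -1, levelsup 0, false. */
typedef struct ToyNode
{
    int  location;
    int  levelsup;
    bool isdistinct;
} ToyNode;

/* Writer: emit a field only when it differs from its default. */
void
toy_out(Buf *b, const ToyNode *node)
{
    buf_appendf(b, "{TOY");
    if (node->location != -1)
        buf_appendf(b, " :location %d", node->location);
    if (node->levelsup != 0)
        buf_appendf(b, " :levelsup %d", node->levelsup);
    if (node->isdistinct)
        buf_appendf(b, " :isdistinct true");
    buf_appendf(b, "}");
}

/* Reader: start from the defaults, then override any field that is present. */
ToyNode
toy_read(const char *s)
{
    ToyNode     node = { .location = -1, .levelsup = 0, .isdistinct = false };
    const char *p;

    if ((p = strstr(s, ":location ")) != NULL)
        node.location = atoi(p + strlen(":location "));
    if ((p = strstr(s, ":levelsup ")) != NULL)
        node.levelsup = atoi(p + strlen(":levelsup "));
    if ((p = strstr(s, ":isdistinct ")) != NULL)
        node.isdistinct = (p[strlen(":isdistinct ")] == 't');
    return node;
}

/* Byte writer: keep the full length, but drop trailing zero bytes. */
void
toy_out_bytes(Buf *b, const unsigned char *data, size_t len)
{
    size_t used = len;

    while (used > 0 && data[used - 1] == 0)
        used--;
    buf_appendf(b, "%zu [ ", len);
    for (size_t i = 0; i < used; i++)
        buf_appendf(b, "%d ", (int) data[i]);
    buf_appendf(b, "]");
}
```

In this toy format, a node with only levelsup set to 2 serializes as "{TOY :levelsup 2}" and reads back with location -1, and a 4-byte value "ab\0\0" is written as "4 [ 97 98 ]". Because the length is still recorded, a reader could zero-fill the omitted tail; whether the patch does exactly that is an assumption here.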
Future work will probably have to be on a significantly different
storage format, as the textual format is about to hit its entropy
limits.
See also [0], [1] and [2], where complaints about the verbosity of
nodeToString were voiced.
Kind regards,
Matthias van de Meent
[0] https://www.postgresql.org/message-id/CAEze2WgGexDM63dOvndLdAWwA6uSmSsc97jmrCuNmrF1JEDK7w%40mail.gmail.com
[1] https://www.postgresql.org/message-id/flat/CACxu%3DvL_SD%3DWJiFSJyyBuZAp_2v_XBqb1x9JBiqz52a_g9z3jA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/4b27fc50-8cd6-46f5-ab20-88dbaadca645%40eisentraut.org
Attachment | Content-Type | Size |
---|---|---|
v0-0001-Reduce-the-size-of-serialized-nodes-in-nodeToStri.patch | application/octet-stream | 55.5 KB |