Re: Reducing output size of nodeToString

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com>
Subject: Re: Reducing output size of nodeToString
Date: 2024-02-15 12:59:04
Message-ID: 018f198b-bbd7-49ad-ac49-4d719d1465d8@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks, this patch set is a good way to incrementally work through these
changes.

I have looked at
v4-0001-pg_node_tree-Omit-serialization-of-fields-with-de.patch today.
Here are my thoughts:

I believe we had discussed offline to not omit enum fields with value 0
(WRITE_ENUM_FIELD). This is because the values of enum fields are
implementation artifacts, and this could be confusing for readers.
(This could be added as a squeeze-out-every-byte change later, but if
we're going to keep the format fit for human reading, I think we should
skip this.)

I have some concerns about the round-trippability of float values. If
we do, effectively, if (node->fldname != 0.0), then I think this would
also match negative zero, but when we read it back it, it would get
assigned positive zero. Maybe there are other edge cases like this.
Might be safer to not mess with this.

On the reading side, the macro nesting has gotten a bit out of hand. :)
We had talked earlier in the thread about the _DIRECT macros and you
said there were left over from something else you want to try, but I see
nothing else in this patch set uses this. I think this could all be
much simpler, like (omitting required punctuation)

#define READ_INT_FIELD(fldname, default)
if ((token = next_field(fldname, &length)))
local_node->fldname = atoi(token);
else
local_node->fldname = default;

where next_field() would

1. read the next token
2. if it is ":fldname", continue;
else rewind the read pointer and return NULL
3. read the next token and return that

Not only is this simpler, but it might also have better performance,
because we don't have separate pg_strtok_next() and pg_strtok() calls in
sequence.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2024-02-15 13:34:12 Re: When extended query protocol ends?
Previous Message Daniel Gustafsson 2024-02-15 12:58:45 Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions