Re: Redacting information from logs

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Redacting information from logs
Date: 2019-08-05 21:26:44
Message-ID: 3a3c53f835db85b578228916c928146ca788c131.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 2019-08-04 at 11:17 -0700, Andres Freund wrote:
> I'm imagining something like
>
> #define pg_printf(fmt, ...) \
> do { \
> if ( __builtin_constant_p(fmt)) \
> { \
> static processed_fmt processed_fmt_ = {.format = fmt}; \
> if (unlikely(!processed_fmt_.parsed)) \
> preprocess_format_string(&processed_fmt_) \
> pg_printf_processed(&processed_fmt_, __VA_ARGS__); \
> } \
> else \
> pg_printf_unprocessed(fmt, __VA_ARGS__); \
> } while (0) \

What would you do in the preprocessing exactly? Create a list of
indexes into the string where the format codes are?

> I think we've quite repeatedly had requests for a log format that can
> be
> parsed reasonably, and annotated with additional information that's
> too
> verbose for the main message. It's a pain to have to parse quite
> complex strings in production, just to extract the value of the
> actual
> information in the message.
>
> ts="2019-08-04 10:19:21.286 PDT" pid=32416 vxid=12/9 sev=LOG \
> msg="Prozess 15027 erlangte AccessShareLock-Sperre auf Relation
> 1259 der Datenbank 13416 nach 29808,865 ms" \
> fmt="process %d acquired %s on %s after %ld.%03d ms" \
> p:modename="AccessShareLock-Sperre auf Relation 1259" \
> p:msec=18343 p:usec=173 \
> p:locktag=190345/134343/0/0/relation/default \
> c:statement="LOCK TABLE pg_class;" \
> l:func=ProcSleep l:file=proc.c l:line=1495
>

...

> If we have enough information to pass to the logging hook, we don't
> even
> need to define how all of this is going to look like exactly
> (although
> I'd probably argue that a logfmt or json target ought to be in core).

I think I see where you are going with this now: it is almost
orthogonal to your new-style format strings ( %{foo}s ), but not quite.

You're suggesting that we save all of the arguments, along with some
annotation, in the ErrorData struct, and then let emit_log_hook sort it
out (perhaps by constructing some JSON message, perhaps translating the
message_id, etc.).

I like the idea, but still playing with the ergonomics a bit, and how
it interacts with various message parts (msg, hint, detail, etc.). If
we had the name-based format strings, then the message parts could all
share a common set of parameters; but with the current positional
format strings, I think each message part would need its own set of
arguments.

Positional:

ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg_params("duplicate key value violates unique
constraint \"%s\"",
errparam("constraintname", MSGDEFAULT,
RelationGetRelationName(rel)),
errdetail_params("Key %s already exists.",
errparam("key", MSGUSERDATA, key_desc)))
);

Named:

ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg_rich("duplicate key value violates unique constraint
\"%{constraintname}s\""),
errdetail_rich("Key %{key}s already exists."),
errparam("key", MSGUSERDATA, key_desc))
errparam("constraintname", MSGDEFAULT,
RelationGetRelationName(rel)))
);

I think either one needs some ergonomic improvements, but it seems like
we are going in the right direction.

Maybe we can make the parameters common to different message parts by
using an integer index to reference the parameter, like:

ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg_rich("duplicate key value violates unique constraint
\"%s\"", 1 /* second errparam */),
errdetail_rich("Key %s already exists.", 0 /* first
errparam */),
errparam("key", MSGUSERDATA, key_desc))
errparam("constraintname", MSGDEFAULT,
RelationGetRelationName(rel)))
);

Not quite ideal, but might get us closer.

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2019-08-05 21:32:36 Re: Redacting information from logs
Previous Message Andres Freund 2019-08-05 21:10:26 Re: Redacting information from logs