Re: when the startup process doesn't (logging startup delays)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: when the startup process doesn't (logging startup delays)
Date: 2021-07-26 17:11:31
Message-ID: CA+TgmoaP2wEFSktmCgwT9LXuz7Y99HNdUYshpk7qNFuQB98g6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 26, 2021 at 11:30 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > Maybe I'm missing something here, but I don't understand the purpose
> > of this. You can always combine two functions into one, but it's only
> > worth doing if you end up with less code, which doesn't seem to be the
> > case here.
>
> 4 files changed, 39 insertions(+), 71 deletions(-)

Hmm. I don't completely hate that, but I don't love it either. I don't
think the result is any easier to understand than the original, and in
some ways it's worse. In particular, you've flattened the separate
comments for each of the individual functions into a single-line
comment that is more generic than the comments it replaced. You could
fix that and you'd still be slightly ahead on LOC, but I'm not
convinced that it's actually a real win.

> > ... but I'm not exactly sure how to get there from here. Having only
> > LogStartupProgress() but having it do a giant if-statement to figure
> > out whether we're mid-phase or end-of-phase does not seem like the
> > right approach.
>
> I used a bool arg and negation to handle within a single switch. Maybe it's
> cleaner to use a separate enum value for each DONE, and set a local done flag.

If we're going to go the route of combining the functions, I agree
that a Boolean is the way to go; I think we have some existing
precedent for 'bool finished' rather than 'bool done'.

But I kind of wonder if we should have an enum in the first place. It
feels like we've got code in a bunch of places that just exists to
decide which enum value to use, and then code someplace else that
turns around and decides which message to produce. That would be
sensible if we were using the same enum values in lots of places, but
that's not the case here. So suppose we just moved the messages to the
places where we're now calling LogStartupProgress() or
LogStartupProgressComplete()? So something like this:

if (should_report_startup_progress())
ereport(LOG,
(errmsg("syncing data directory (syncfs), elapsed
time: %ld.%02d s, current path: %s",
secs, (usecs / 10000), path)));

Well, that doesn't quite work, because "secs" and "usecs" aren't going
to exist in the caller. We can fix that easily enough: let's just make
should_report_startup_progress take arguments long *secs, int *usecs,
and bool done. Then the above becomes:

if (should_report_startup_progress(&secs, &usecs, false))
ereport(LOG,
(errmsg("syncing data directory (syncfs), elapsed
time: %ld.%02d s, current path: %s",
secs, (usecs / 10000), path)));

And if this were the call site that corresponds to
LogStartupProgressComplete(), we'd replace false with true. Now, the
only real disadvantage of this that I can see is that it requires the
caller to declare 'secs' and 'usecs', which is not a big deal, but
mildly annoying perhaps. I think we can do better still with a little
macro magic. Suppose we define a macro report_startup_progress(force,
msg, ...) that expands to:

{
long secs;
int usecs;
if (startup_progress_timer_expired(&secs, &usecs, force))
ereport(LOG, errmsg(msg, secs, usecs, ...));
}

Then the above just becomes this:

report_startup_progress(false, "syncing data directory (syncfs),
elapsed time: %ld.%02d s, current path: %s", path);

This would have the advantage that the strings we're using would be
present in the code that arranges to emit them, instead of being
removed to some other module, so I think it would be clearer. It would
also have the advantage of making it much easier to add further calls,
if someone feels they want to do that. You don't have to run around
and update enums and all the various things that use them, just copy
and paste the line above and adjust as required.

With this design, we avoid a lot of "action at a distance". We don't
define the message strings in a place far-removed from the code that
wants to emit them any more. When someone wants a new progress
message, they can just add another call to report_statup_progress()
wherever it needs to go and they're done. They don't have to go run
and update the enum and various switch statements. They're just done.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-07-26 17:14:20 Re: Rename of triggers for partitioned tables
Previous Message Bossart, Nathan 2021-07-26 17:08:52 Re: Avoiding data loss with synchronous replication