From: | Joe Abbate <jma(at)freedomcircle(dot)com> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | checkpointer and other server processes crashing |
Date: | 2021-02-15 21:15:44 |
Message-ID: | e426330d-bdbb-98d8-5e74-09998258503d@freedomcircle.com |
Lists: | pgsql-general |
Hello,
We've been experiencing PG server process crashes about every other week
on a mostly read-only website (except for a single insert/update on page
access). Typical log entries look like:
LOG: checkpointer process (PID 11200) was terminated by signal 9: Killed
LOG: terminating any other active server processes
Other than the checkpointer, the server process that was terminated was
either doing a "BEGIN READ WRITE", a "COMMIT" or executing a specific
SELECT.
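
For anyone who wants to tally these events from the server log, the termination entries quoted above can be extracted with a short script. This is only a minimal sketch: it assumes the lines follow the format shown in this message, and the regular expression, the `Termination` type, and the `parse_termination` function are illustrative names, not part of PostgreSQL.

```python
import re
from typing import NamedTuple, Optional

# Matches entries of the form quoted in the message, e.g.
# "LOG: checkpointer process (PID 11200) was terminated by signal 9: Killed"
# (assumed format; any log_line_prefix before "LOG:" is ignored by search()).
TERMINATED = re.compile(
    r"LOG:\s+(?P<process>.+?) \(PID (?P<pid>\d+)\) "
    r"was terminated by signal (?P<signal>\d+): (?P<name>.+)$"
)


class Termination(NamedTuple):
    process: str      # e.g. "checkpointer process"
    pid: int
    signal: int       # numeric signal, e.g. 9
    signal_name: str  # description after the colon, e.g. "Killed"


def parse_termination(line: str) -> Optional[Termination]:
    """Return the parsed termination event, or None if the line is not one."""
    m = TERMINATED.search(line.strip())
    if m is None:
        return None
    return Termination(m["process"], int(m["pid"]), int(m["signal"]), m["name"])


# The two sample lines are the ones quoted in the message.
sample_log = [
    "LOG: checkpointer process (PID 11200) was terminated by signal 9: Killed",
    "LOG: terminating any other active server processes",
]
events = [e for e in map(parse_termination, sample_log) if e is not None]
for e in events:
    print(f"{e.process} (PID {e.pid}) got signal {e.signal} ({e.signal_name})")
```

Counting events per process name and per day would show whether the checkpointer is always the first process to be killed.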
The database is always recovered within a second and everything else
appears to resume normally. We're not certain about what triggers this,
but in several instances the web logs show an external bot issuing
multiple HEAD requests on what is logically a single page. The web
server logs show "broken pipe" and EOF errors, and the PG logs sometimes
show a number of "incomplete startup packet" messages before the
termination message.
This started roughly when the site was migrated to Go, whose web
"processes" run as "goroutines", scheduled by Go's runtime (previously
the site used Python and Gunicorn to serve the pages, which probably
isolated the PG processes from a barrage of nearly simultaneous requests).
As I understand it, the PG server processes doing a SELECT are spawned
as children of the Go process, so presumably if a "goroutine" dies, the
associated PG process would die too, but I'm not sure I grasp why that
would cause a recovery/restart. I also don't understand where the
checkpointer process fits in the picture (and what would cause it to die).
For the record, this is on PG 11.9 running on Debian.
TIA,
Joe