From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-committers(at)postgresql(dot)org |
Subject: | Re: pgsql: Make new crash restart test a bit more robust. |
Date: | 2017-09-19 20:46:58 |
Message-ID: | 1136.1505854018@sss.pgh.pa.us |
Lists: | pgsql-committers pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> So this is genuinely interesting. When the machine is really loaded (as
> in 6 animals running on a vm at the same time, including valgrind), psql
> sometimes doesn't get the WARNING message from a shutdown. Instead it
> gets
> # psql:<stdin>:3: server closed the connection unexpectedly
> # This probably means the server terminated abnormally
> # before or while processing the request.
> # psql:<stdin>:3: connection to server was lost
That seems pretty weird. Maybe it's not the same case, but in
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2017-09-19%2020%3A10%3A02
you can see from the postmaster log that the backend *is* issuing
the message, or at least it's getting to the server log:
2017-09-19 20:20:34.476 UTC [6363] [unknown] LOG: connection received: host=[local]
2017-09-19 20:20:34.477 UTC [6363] [unknown] LOG: connection authorized: user=andres database=postgres
2017-09-19 20:20:34.478 UTC [6363] t/013_crash_restart.pl LOG: statement: SELECT $$psql-connected$$;
...
2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl WARNING: terminating connection because of crash of another server process
2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl HINT: In a moment you should be able to reconnect to the database and repeat your command.
Have we forgotten an fflush() or something?
Also, maybe the problem is on the client side. I vaguely recall a libpq bug
wherein it would complain about socket EOF even though data remained
to be processed. Maybe we reintroduced something like that?
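To illustrate the kind of client-side ordering problem meant here, the following is a minimal sketch in Python, not libpq code. It assumes a line-oriented toy protocol over a local socket pair; the function names `read_all_then_report` and `demo` are hypothetical. The point it shows is only that a reader should finish processing whatever has already arrived before it treats EOF as "connection lost":

```python
import socket


def read_all_then_report(sock):
    """Read from sock until EOF, returning (messages, saw_eof).

    Every complete line that has already been received is processed
    before EOF is reported, so a final message sent just before the
    peer closed the socket is not dropped.
    """
    buffer = b""
    messages = []
    saw_eof = False
    while not saw_eof:
        chunk = sock.recv(4096)
        if chunk == b"":
            # Peer closed the connection; remember it, but still drain
            # whatever is left in the buffer below.
            saw_eof = True
        else:
            buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            messages.append(line.decode())
    return messages, saw_eof


def demo():
    # Hypothetical stand-in for a backend that sends one last message
    # and then exits immediately.
    server, client = socket.socketpair()
    server.sendall(
        b"WARNING:  terminating connection because of crash of "
        b"another server process\n"
    )
    server.close()
    messages, saw_eof = read_all_then_report(client)
    client.close()
    return messages, saw_eof


if __name__ == "__main__":
    print(demo())
```

A reader that reported the error as soon as it noticed the close, without first consuming the buffered bytes, would show only the "server closed the connection unexpectedly" style of failure even though the WARNING had been sent.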
> We can obviously easily make the test accept both - but are we ok with
> the client sometimes not getting the message?
I'm not ...
regards, tom lane