From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Cancelling parallel query leads to segfault |
Date: | 2018-02-14 18:56:51 |
Message-ID: | 20180214185651.277g7o3xdzys624d@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2018-02-12 15:43:49 -0500, Peter Eisentraut wrote:
> On 2/6/18 12:06, Andres Freund wrote:
> > On 2018-02-06 12:01:08 -0500, Peter Eisentraut wrote:
> >> On 2/1/18 20:35, Andres Freund wrote:
> >>> On February 1, 2018 11:13:06 PM GMT+01:00, Peter Eisentraut
> >>> <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> >>>> Here is a patch to implement that idea. Do you have a way to test it
> >>>> repeatedly, or do you just randomly cancel queries?
> >>>
> >>> For me cancelling the long running parallel queries I tried reliably
> >>> triggers the issue. I encountered it while cancelling tpch q1 during JIT
> >>> work.
> >>
> >> Why does canceling a query result in elog(FATAL)? It should just be
> >> elog(ERROR), which wouldn't trigger this issue.
> >
> > The workers are shut down.
>
> I have used the setup mentioned in
> <https://www.postgresql.org/message-id/6a909374-2602-7136-8c70-397330a418f3%402ndquadrant.com>
> to reproduce this, without success. I have tried statement_timeout and
> manual cancels. Any other ideas?
>
> I don't doubt that the issue exists, but it would be nice to be able to
> reproduce it.

With your example I can reliably trigger the issue if I shut down the
server while the query is running:

^C2018-02-14 10:54:06.786 PST [22261][] LOG: received fast shutdown request
2018-02-14 10:54:06.786 PST [22261][] LOG: aborting any active transactions
2018-02-14 10:54:06.786 PST [22275][4/3] FATAL: terminating connection due to administrator command
2018-02-14 10:54:06.786 PST [22275][4/3] STATEMENT: select from t1 where a = 55;
2018-02-14 10:54:06.786 PST [22274][5/3] FATAL: terminating connection due to administrator command
2018-02-14 10:54:06.786 PST [22274][5/3] STATEMENT: select from t1 where a = 55;
2018-02-14 10:54:06.786 PST [22271][3/2] FATAL: terminating connection due to administrator command
2018-02-14 10:54:06.786 PST [22271][3/2] STATEMENT: select from t1 where a = 55;
2018-02-14 10:54:06.787 PST [22261][] LOG: background worker "logical replication launcher" (PID 22268) exited with exit code 1
2018-02-14 10:54:06.787 PST [22261][] LOG: background worker "parallel worker" (PID 22274) exited with exit code 1
2018-02-14 10:54:06.787 PST [22261][] LOG: background worker "parallel worker" (PID 22275) exited with exit code 1
2018-02-14 10:54:06.788 PST [22261][] LOG: server process (PID 22271) was terminated by signal 11: Segmentation fault
2018-02-14 10:54:06.788 PST [22261][] DETAIL: Failed process was running: select from t1 where a = 55;
2018-02-14 10:54:06.788 PST [22261][] LOG: terminating any other active server processes
2018-02-14 10:54:06.789 PST [22285][] FATAL: the database system is shutting down
2018-02-14 10:54:06.789 PST [22261][] LOG: abnormal database system shutdown
2018-02-14 10:54:06.790 PST [22261][] LOG: database system is shut down

but only if I don't use EXPLAIN ANALYZE. Not quite sure what that is
about.

Your patch appears to fix the issue.

Greetings,

Andres Freund