From: | Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com> |
---|---|
To: | PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Passing query string to workers |
Date: | 2017-01-11 06:12:08 |
Message-ID: | CAOGQiiMH_nOOGkxhbidnwfZ1n5pQayEzbE5iv9rO2oA8GfVj0Q@mail.gmail.com |
Lists: | pgsql-hackers |
Hello everybody,
Currently, the query string is not passed to the parallel workers; only the
master has it. When multiple queries are running on a server and a worker
for one of them crashes, it is quite burdensome to find the query that the
crashed worker belonged to, since no query is displayed in the log when a
worker crashes.
To fix this, I propose a patch that passes the query string to the workers
as well, so that it is displayed when a worker crashes.
Approach:
A token for the query string is created in shared memory, and it is
populated with the query string from the global variable
debug_query_string. Then, in each worker, when ExecParallelGetQueryDesc is
called, we retrieve the query text from shared memory and pass it to
CreateQueryDesc.
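As a rough illustration of the hand-off described above, here is a standalone
sketch. It is not the patch and does not use PostgreSQL's shm_toc API; it only
mimics the idea of storing the query text under a key in a shared segment and
looking it up from the worker side. All Demo* and demo_* names, the key value,
and the sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key; the patch defines its own key for the query text. */
#define DEMO_KEY_QUERY_TEXT UINT64_C(0xE000000000000010)

#define DEMO_MAX_ENTRIES  8
#define DEMO_SEGMENT_SIZE 1024

/* One table-of-contents entry: a key and the offset of its data. */
typedef struct DemoEntry
{
    uint64_t    key;
    size_t      offset;
} DemoEntry;

/* Stand-in for a shared memory segment with a small table of contents. */
typedef struct DemoSegment
{
    size_t      nentries;
    size_t      used;
    DemoEntry   entries[DEMO_MAX_ENTRIES];
    char        data[DEMO_SEGMENT_SIZE];
} DemoSegment;

static void
demo_segment_init(DemoSegment *seg)
{
    memset(seg, 0, sizeof(*seg));
}

/*
 * Leader side: copy the NUL-terminated query text into the segment and
 * register it under the given key.  Returns 0 on success, -1 if there is
 * no room left.
 */
static int
demo_store_query(DemoSegment *seg, uint64_t key, const char *query)
{
    size_t      len;

    assert(seg != NULL && query != NULL);
    len = strlen(query) + 1;    /* include the terminating NUL */
    if (seg->nentries >= DEMO_MAX_ENTRIES ||
        len > DEMO_SEGMENT_SIZE - seg->used)
        return -1;

    memcpy(seg->data + seg->used, query, len);
    seg->entries[seg->nentries].key = key;
    seg->entries[seg->nentries].offset = seg->used;
    seg->nentries++;
    seg->used += len;
    return 0;
}

/*
 * Worker side: find the query text by key.  Returns NULL if the key was
 * never stored.
 */
static const char *
demo_lookup_query(const DemoSegment *seg, uint64_t key)
{
    size_t      i;

    for (i = 0; i < seg->nentries; i++)
    {
        if (seg->entries[i].key == key)
            return seg->data + seg->entries[i].offset;
    }
    return NULL;
}
```

In the sketch, the leader would call demo_store_query() once before launching
workers, and each worker would call demo_lookup_query() with the same key
before building its query descriptor.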
Next, to ensure that the query gets displayed at the time of a crash,
BackendStatusArray needs to be populated correctly; specifically, for our
purpose, the activity field needs to be filled with the current query. For
this, I call pgstat_report_activity in ParallelWorkerMain with the query
string. This also populates the workers' rows in the system view
pg_stat_activity. Previously, pgstat_report_activity was only called for
the master in exec_simple_query, hence the query in pg_stat_activity
remained null for workers.
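To make the effect on the crash report concrete, here is a small standalone
sketch. It is not PostgreSQL code: DemoStatusSlot, demo_report_activity and
demo_crash_detail are hypothetical stand-ins for a per-process status entry,
the activity report, and the code that formats the DETAIL line. The buffer
size is arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed size for the stored activity text. */
#define DEMO_ACTIVITY_SIZE 64

/* Stand-in for one process's entry in a status array. */
typedef struct DemoStatusSlot
{
    int         pid;
    char        activity[DEMO_ACTIVITY_SIZE];
} DemoStatusSlot;

/*
 * Record what the process is running.  A NULL query leaves the activity
 * empty, which models a worker that never reported its query.
 */
static void
demo_report_activity(DemoStatusSlot *slot, int pid, const char *query)
{
    assert(slot != NULL);
    slot->pid = pid;
    /* snprintf truncates to the slot size and always NUL-terminates. */
    snprintf(slot->activity, sizeof(slot->activity), "%s",
             query ? query : "");
}

/*
 * Format the detail line for a crashed process.  Returns 1 and fills buf
 * if an activity string was reported, 0 otherwise (no detail to show).
 */
static int
demo_crash_detail(const DemoStatusSlot *slot, char *buf, size_t buflen)
{
    if (slot->activity[0] == '\0')
        return 0;
    snprintf(buf, buflen, "Failed process was running: %s", slot->activity);
    return 1;
}
```

With an empty activity the sketch produces no detail line, which corresponds
to the unpatched behaviour; once the worker reports its query, the detail line
can be emitted.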
Results:
Here is the output for an artificially induced worker crash, without and
with the patch.
Error report on worker crash without the patch:
LOG: worker process: parallel worker for PID 49739 (PID 49741) was
terminated by signal 11: Segmentation fault
Error report with the patch:
LOG: worker process: parallel worker for PID 51757 (PID 51758) was
terminated by signal 11: Segmentation fault
2017-01-11 11:10:27.630 IST [51742] DETAIL: Failed process was running:
explain analyse select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date '1998-12-01' - interval '119' day
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus
LIMIT 1;
Input of all sorts is welcome.
--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/
Attachment | Content-Type | Size |
---|---|---|
pass_queryText_to_workers_v1.patch | application/octet-stream | 3.2 KB |