Re: [PATCH] New command to monitor progression of long running queries

From: Remi Colinet <remi(dot)colinet(at)gmail(dot)com>
To: Maksim Milyutin <m(dot)milyutin(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] New command to monitor progression of long running queries
Date: 2017-04-18 14:39:15
Message-ID: CADdR5nxTfpAgKrQ0o+cGHM1hkqBRgU-n-6RjAD1bKQzc9ck=QA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Maksim,

The core implementation I suggested for the new PROGRESS command uses
different functions from the one used by EXPLAIN for its core
implementation.
Some source code is shared with EXPLAIN command. But this shared code is
only related to quals, properties, children, subPlans and few other nodes.

All other code for PROGRESS is new code.

I don't believe explain.c code can be fully shared with the one of the new
PROGRESS command. These 2 commands have different purposes.
The core code used for the new PROGRESS command is very different from the
core code used for EXPLAIN.

I only extracted some common code from explain.c and put it in report.c
which is used by progress.c.This code is valid for Plan and PlanState.

The code shared is:

ReportPreScanNode() renamed from ExplainPreScanNode()
ReportBeginOutput() renamed from ExplainBeginOutput()
ReportEndOutput() renamed from ExplainEndOutput()
ReportOpenGroup() ...
ReportProperties() ...
ReportPropertyText() ...
ReportHasChildren() ...
ReportSubPlans() ...
ReportMemberNodes() ...
ReportCustomChildren() ...
ReportCloseGroup() ...

ExplainState has been renamed ReportState.

Regarding the queryDesc state of SQL query upon receiving a request to
report its execution progress, it does not bring any issue. The request is
noted when the signal is received by the monitored backend. Then, the
backend continues its execution code path. When interrupts are checked in
the executor code, the request will be dealt.

When the request is being dealt, the monitored backend will stop its
execution and report the progress of the SQL query. Whatever is the status
of the SQL query, progress.c code checks the status and report either that
the SQL query does not have a valid status, or otherwise the current
execution state of the SQL query.

SQL query status checking is about:
- idle transaction
- out of transaction status
- null planned statement
- utility command
- self monitoring

Other tests can be added if needed to exclude some SQL query state. Such
checking is done in void HandleProgressRequest(void).
I do not see why a SQL query progression would not be possible in this
context. Even when the queryDescc is NULL, we can just report a <idle
transaction> output. This is currently the case with the patch suggested.

So far, I've found this new command very handy. It allows to evaluate the
time needed to complete a SQL query.

A further improvement would be to report the memory, disk and time
resources used by the monitored backend. An overuse of memory, disk and
time resources can prevent the SQL query to complete.

Best regards
Remi

2017-04-18 15:00 GMT+02:00 Maksim Milyutin <m(dot)milyutin(at)postgrespro(dot)ru>:

> Hi!
>
>
> On 17.04.2017 15:09, Remi Colinet wrote:
>
>> Hello,
>>
>> I've implemented a new command named PROGRESS to monitor progression of
>> long running SQL queries in a backend process.
>>
>>
>> Use case
>> =======
>>
>> A use case is shown in the below example based on a table named t_10m
>> with 10 millions rows.
>>
>> The table has been created with :
>>
>> CREATE TABLE T_10M ( id integer, md5 text);
>> INSERT INTO T_10M SELECT generate_series(1,10000000) AS id,
>> md5(random()::text) AS md5;
>>
>> 1/ Start a first psql session to run long SQL queries:
>>
>> [pgadm(at)rco ~]$ psql -A -d test
>> psql (10devel)
>> Type "help" for help.
>> test=#
>>
>> The option -A is used to allow rows to be output without formatting work.
>>
>> Redirect output to a file in order to let the query run without terminal
>> interaction:
>> test=# \o out
>>
>> Start a long running query:
>> test=# select * from t_10M order by md5;
>>
>> 2/ In a second psql session, list the backend pid and their SQL query
>>
>> [pgadm(at)rco ~]$ psql -d test
>> psql (10devel)
>> Type "help" for help.
>>
>> test=# select pid, query from pg_stat_activity ;
>> pid | query
>> -------+-------------------------------------------
>> 19081 |
>> 19084 |
>> 19339 | select pid, query from pg_stat_activity ;
>> 19341 | select * from t_10m order by md5;
>> 19727 | select * from t_10m order by md5;
>> 19726 | select * from t_10m order by md5;
>> 19079 |
>> 19078 |
>> 19080 |
>> (9 rows)
>>
>> test=#
>>
>> Chose the pid of the backend running the long SQL query to be monitored.
>> Above example is a parallel SQL query. Lowest pid is the main backend of
>> the query.
>>
>> test=# PROGRESS 19341;
>> PLAN
>> PROGRESS
>> ------------------------------------------------------------
>> -------------------------------
>> Gather Merge
>> -> Sort=> dumping tuples to tapes
>> rows r/w merge 0/0 rows r/w effective 0/2722972 0%
>> Sort Key: md5
>> -> Parallel Seq Scan on t_10m => rows 2751606/3954135 69% blks
>> 125938/161222 78%
>> (5 rows)
>>
>> test=#
>>
>> The query of the monitored backend is:
>> test=# select * from t_10M order by md5;
>>
>> Because the table has 10 millions of rows, the sort is done on tapes.
>>
>>
>> Design of the command
>> =================
>>
>> The design of the patch/command is:
>> - the user issue the "PROGRESS pid" command from a psql session. The pid
>> is the one of the backend which runs the SQL query for which we want to
>> get a progression report. It can be determined from the view
>> pg_stat_activity.
>> - the monitoring backend, upon receiving the "PROGRESS pid" command from
>> psql utility used in step above, sends a signal to the backend whose
>> process pid is the one provided in the PROGRESS command.
>> - the monitored backend receives the signal and notes the request as for
>> any interrupt. Then, it continues its execution of its SQL query until
>> interrupts can be serviced.
>> - when the monitored process can service the interrupts, it deals with
>> the progress request by collecting its execution tree with the execution
>> progress of each long running node. At this time, the SQL query is no
>> more running. The progression of each node is calculated during the
>> execution of the SQL query which is at this moment stopped. The
>> execution tree is dumped in shared memory pages allocated at the start
>> of the server. Then, the monitored backend set a latch on which the
>> monitoring process is waiting for. It then continues executing its SQL
>> query.
>> - the monitoring backend collects the share memory data dumped by the
>> monitored backed, and sends it to its psql session, as a list of rows.
>>
>> The command PROGRESS does not incur any slowness on the running query
>> because the execution progress is only computed upon receiving the
>> progress request which is supposed to be seldom used.
>>
>> The code heavily reuses the one of the explain command. In order to
>> share as much code as possible with the EXPLAIN command, part of the
>> EXPLAIN code which deals with reporting quals for instance, has been
>> moved to a new report.c file in the src/backend/commands folder. This
>> code in report.c is shared between explain.c source code and PROGRESS
>> command source code which is in progress.c file.
>>
>> The progression reported by PROGRESS command is given in terms of rows,
>> blocks, bytes and percents. The values displayed depend on the node type
>> in the execution plan.
>>
>> The current patch implements all the possible nodes which could take a
>> lot of time:
>> - Sequential scan nodes with rows and block progress (node type
>> T_SeqScan, T_SampleScan, T_BitmapHeaepScan, T_SubqueryScan,
>> T_FunctionScan, T_ValuesScan, T_CteScan, T_WorkTableScan)
>> - Tuple id scan node with rows and blocks progress (T_TidScan)
>> - Limit node with rows progress (T_Limit)
>> - Foreign and custom scan with rows and blocks progress (T_ForeignScan,
>> T_CustomScan)
>> - Index scan, index only scan and bitmap index scan with rows and blocks
>> progress
>>
>>
>> Use cases
>> ========
>>
>> Some further examples of use are shown below in the test_v1.txt file.
>>
>>
>> What do you make of this idea/patch?
>>
>> Does it make sense?
>>
>> Any suggestion is welcome.
>>
>> The current patch is still work in progress. It is meanwhile stable. It
>> can be used with regular queries. Utilities commands are not supported
>> for the moment.
>> Documentation is not yet written.
>>
>> Regards
>> Remi
>>
>>
> I had implemented analogous feature as extension *pg_query_state* [1] the
> idea of which I proposed in the thread [2]. Together with this extension I
> provided some patches to postgres core to enable to send custom signals to
> working backend (similar to your PROCSIG_PROGRESS) and to print the current
> query state through patches in 'ExplainNode' function.
>
> I had implemented the same mechanics as you:
> 1) interrupt the working backend through ProcSignal;
> 2) handle progress request in the CHECK_FOR_INTERRUPTS entry;
> 3) transfer query state through shared memory to caller.
> But criticism of my approach was that the structure 'QueryDesc' on basis
> of which query state is formed can be inconsistent in the places where
> CHECK_FOR_INTERRUPTS appears [3].
>
> I plan to propose the custom_signal patch to community as soon as possible
> and as consequence release *pg_query_state* from dependency on patches to
> postgres core. In perspective, I want to resolve the problem related to
> binding to the CHECK_FOR_INTERRUPTS entries perhaps through patching the
> executor and implement the robust PROGRESS command.
>
>
> 1. https://github.com/postgrespro/pg_query_state
> 2. https://www.postgresql.org/message-id/dbfb1a42-ee58-88fd-8d7
> 7-550498f52bc5%40postgrespro.ru
> 3. https://www.postgresql.org/message-id/24182.1472745492%40sss.pgh.pa.us
>
>
> --
> Maksim Milyutin
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2017-04-18 14:42:56 Re: scram and \password
Previous Message Masahiko Sawada 2017-04-18 14:28:36 Re: pgsql: Update copyright for 2017