From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Subject: | Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors |
Date: | 2018-08-17 07:49:08 |
Message-ID: | alpine.DEB.2.21.1808170917510.20841@lancre |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Marina,
>> Detailed -r report. I understand from the doc that the retry number on
>> the detailed per-statement report is to identify at what point errors
>> occur? Probably this is more or less always at the same point on a
>> given script, so that the most interesting feature is to report the
>> number of retries at the script level.
>
> This may depend on various factors.. for example:
> [...]
> 21.239 5 36 UPDATE xy3 SET y = y + :delta WHERE x
> = :x3;
> 21.360 5 39 UPDATE xy2 SET y = y + :delta WHERE x
> = :x2;
Ok, not always the same point, and you confirm that it identifies where
the error is raised which leads to a retry.
> And you can always get the number of retries at the script level from the
> main report (if only one script is used) or from the report for each script
> (if multiple scripts are used).
Ok.
>> ISTM that "skipped" transactions are NOT "successful" so there are a
>> problem with comments. I believe that your formula are probably right,
>> it has more to do with what is "success". For cnt decomposition, ISTM
>> that "other transactions" are really "directly successful
>> transactions".
>
> I agree with you, but I also think that skipped transactions should not be
> considered errors.
I'm ok with having a special category for them in the explanations, which
is neither success nor error.
> So we can write something like this:
> All the transactions are divided into several types depending on their
> execution. Firstly, they can be divided into transactions that we started to
> execute, and transactions which were skipped (it was too late to execute
> them). Secondly, running transactions fall into 2 main types: is there any
> command that got a failure during the last execution of the transaction
> script or not? Thus
Here is an attempt at having a more precise and shorter version, not sure
it is much better than yours, though:
"""
Transactions are counted depending on their execution and outcome. First
a transaction may have started or not: skipped transactions occur under
--rate and --latency-limit when the client is too late to execute them.
Secondly, a started transaction may ultimately succeed or fail on some
error, possibly after some retries when --max-tries is not one. Thus
"""
> the number of all transactions =
> skipped (it was too late to execute them)
> cnt (the number of successful transactions) +
> ecnt (the number of failed transactions).
>
> A successful transaction can have several unsuccessful tries before a
> successfull run. Thus
>
> cnt (the number of successful transactions) =
> retried (they got a serialization or a deadlock failure(s), but were
> successfully retried from the very beginning) +
> directly successfull transactions (they were successfully completed on
> the first try).
These above description is clearer for me.
>> I'd suggest to put "ANOTHER_SQL_FAILURE" as the last option, otherwise
>> "another" does not make sense yet.
>
> Maybe firstly put a general group, and then special cases?...
I understand it more as a catch all default "none of the above" case.
>> In TState, field "uint32 retries": maybe it would be simpler to count
>> "tries", which can be compared directly to max tries set in the option?
>
> If you mean retries in CState - on the one hand, yes, on the other hand,
> statistics always use the number of retries...
Ok.
>> The automaton skips to FAILURE on every possible error. I'm wondering
>> whether it could do so only on SQL errors, because other fails will
>> lead to ABORTED anyway? If there is no good reason to skip to FAILURE
>> from some errors, I'd suggest to keep the previous behavior. Maybe the
>> good reason is to do some counting, but this means that on eg
>> metacommand errors now the script would loop over instead of aborting,
>> which does not look like a desirable change of behavior.
>
> Even in the case of meta command errors we must prepare for CSTATE_END_TX and
> the execution of the next script: if necessary, clear the conditional stack
> and rollback the current transaction block.
Seems ok.
>> commandFailed: I'm not thrilled by the added boolean, which is partially
>> redundant with the second argument.
>
> Do you mean that it is partially redundant with the argument "cmd" and, for
> example, the meta commands errors always do not cause the abortions of the
> client?
Yes. And also I'm not sure we should want this boolean at all.
> [...]
> If in such cases one command is placed on several lines, ISTM that the code
> is more understandable if curly brackets are used...
Hmmm. Such basic style changes are avoided because they break
backpatching, so we try to avoid gratuitous changes unless there is a
strong added value, which does not seem to be the case here.
--
Fabien.
From | Date | Subject | |
---|---|---|---|
Next Message | Paul Bonaud | 2018-08-17 07:57:08 | Re: Doc patch: pg_upgrade page and checkpoint location consistency with replicas |
Previous Message | Magnus Hagander | 2018-08-17 07:48:26 | Re: Problem with OpenSCG downloads |