From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Improving the isolationtester: fewer failures, less delay |
Date: | 2021-06-16 01:18:20 |
Message-ID: | 20210616011820.2ktysyfozfhqdtha@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Only halfway related: I wonder if we should remove the automatic
permutation stuff - it's practically never useful. Probably not worth
changing...
On 2021-06-15 17:09:00 -0400, Tom Lane wrote:
> +The general form of a permutation entry is
> +
> + "step_name" [ ( marker [ , marker ... ] ) ]
> +
> +where each marker defines a "blocking condition". The step will not be
> +reported as completed before all the blocking conditions are satisfied.
Minor suggestion: I think the folliwing would be a bit easier to read if
there first were a list of markers, and then separately the longer
descriptions. Right now it's a bit hard to see which paragraph
introduces a new type of marker, and which just adds further commentary.
> + /*
> + * Check for other steps that have finished. We must do this
> + * if oldstep completed; while if it did not, we need to poll
> + * all the active steps in hopes of unblocking oldstep.
> + */
Somehow I found the second sentence a bit hard to parse - I think it's
the "while ..." sounding a bit odd to me.
> + /*
> + * If the target session is still busy, apply a timeout to
> + * keep from hanging indefinitely, which could happen with
> + * incorrect blocker annotations. Use the same 2 *
> + * max_step_wait limit as try_complete_step does for deciding
> + * to die. (We don't bother with trying to cancel anything,
> + * since it's unclear what to cancel in this case.)
> + */
> + if (iconn->active_step != NULL)
> + {
> + struct timeval current_time;
> + int64 td;
> +
> + gettimeofday(¤t_time, NULL);
> + td = (int64) current_time.tv_sec - (int64) start_time.tv_sec;
> + td *= USECS_PER_SEC;
> + td += (int64) current_time.tv_usec - (int64) start_time.tv_usec;
>+ if (td > 2 * max_step_wait)
> + {
> + fprintf(stderr, "step %s timed out after %d seconds\n",
> + iconn->active_step->name,
> + (int) (td / USECS_PER_SEC));
> + exit(1);
> + }
> + }
> + }
> }
It might be worth printing out the list of steps the being waited for
when reaching the timeout - it seems like it'd now be easier to end up
with a bit hard to debug specs. And one cannot necessarily look at
pg_locks or such anymore to debug em.
> @@ -833,6 +946,19 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
> }
> }
>
> + /*
> + * The step is done, but we won't report it as complete so long as there
> + * are blockers.
> + */
> + if (step_has_blocker(pstep))
> + {
> + if (!(flags & STEP_RETRY))
> + printf("step %s: %s <waiting ...>\n",
> + step->name, step->sql);
> + return true;
> + }
Might be a bug in my mental state machine: Will this work correctly for
PSB_ONCE, where we'll already returned before?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2021-06-16 01:22:51 | Re: Improving the isolationtester: fewer failures, less delay |
Previous Message | David Rowley | 2021-06-16 01:15:17 | Re: disfavoring unparameterized nested loops |