From: Noah Misch <noah(at)leadboat(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>, pavel(dot)stehule(at)gmail(dot)com
Subject: Re: pgbench progress report improvements
Date: 2013-09-21 20:40:00
Message-ID: 20130921204000.GA433915@tornado.leadboat.com
Lists: pgsql-hackers
On Sat, Sep 21, 2013 at 10:55:54AM +0200, Fabien COELHO wrote:
>>> Improve pgbench measurements & progress report
>>
>> These changes are loosely coupled; please separate them into several patch
>> files:
>
> I thought about this. I submitted a bunch of very small pgbench patches
> to the previous commit fest, and it resulted in a lot of duplicated tests
> by reviewers, which made it a waste of their precious time.
>
> ISTM that what you suggest will result in more reviewer time, which is
> the critical resource of the commitfest.
>
> Moreover, splitting the patch as you suggest, especially in 7 patches,
> would result in conflicting or dependent patches, which creates yet more
> issues for the reviews. There is *some* coupling, because the reports are
> about the measurements performed, and I change both to try to have
> something consistent in the end. I can separate the doc fix, but this is
> just one line...
Post all the patch files to the same mailing list thread and have them
reviewed under a single CommitFest entry. A reviewer or committer preferring
less granularity can squash patches together effortlessly. The inverse task,
splitting a patch, is not so mechanical. Consequently, err on the side of
more, smaller patches; it almost always saves reviewer and committer time.
> Basically, I tried to optimize reviewer time with a combined patch focused
> on pgbench measurements and their reporting.
Relatedness of changes is a continuum, and committers sometimes disagree on
how far to subdivide before commit. This patch is not a borderline case;
removing a bias from --rate throttle delays does not belong in the same commit
as enabling progress reports by default.
> But this does not prevent discussing features separately!
Actually, it does hamper debate. With smaller patches, a person interested in
just one issue can read just that patch. With a monolithic patch, one must
take in the entire thing to fully comprehend any single issue.
>> Patch (1): Change the default from --progress=0 to --progress=5
>>
>> This has a disadvantage of causing two extra gettimeofday() calls per
>> transaction. That's not cheap on all platforms; a user comparing
>> pgbench results across versions will need to make a point of disabling
>> this again to make his results comparable. Given that threat and
>> uncertainty about which default would be more popular, I think we
>> should drop this part.
>
> The gettimeofday call time is very small compared to network and database
> (disk!) accesses involved in a pgbench run. On my small laptop,
> gettimeofday takes about 0.00000002 seconds (0.02 µs -- microseconds)
> per call, which is typically under 1/1000 of the fastest local-socket
> read-only one-table cache-hit prepared trivial transaction.
Many systems do have cheap gettimeofday(). Many don't. We recently added
contrib/pg_test_timing in response to the variability in this area.
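For anyone who wants a quick sanity check of their own platform, a throwaway
test program along these lines (hypothetical, not part of any patch) gives a
rough per-call figure; contrib/pg_test_timing paints a more complete picture:

    /* Rough estimate of the per-call cost of gettimeofday() locally. */
    #include <stdio.h>
    #include <sys/time.h>

    int
    main(void)
    {
        struct timeval start, stop, tmp;
        long        i, loops = 10 * 1000 * 1000;
        double      elapsed;

        gettimeofday(&start, NULL);
        for (i = 0; i < loops; i++)
            gettimeofday(&tmp, NULL);
        gettimeofday(&stop, NULL);

        elapsed = (stop.tv_sec - start.tv_sec)
            + (stop.tv_usec - start.tv_usec) / 1000000.0;
        printf("%.4f us per gettimeofday() call\n", 1e6 * elapsed / loops);
        return 0;
    }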
> I found that running pgbench to get significant results requires long
> minutes or even hours because of warmup time. Running a command for 30
> minutes without any feedback is annoying. The second point is that the
> feedback would help users notice that the figures evolve slowly but
> significantly, and realise that their too-short runs are not significant
> at all. So it seems to me that it should be the desirable behavior, and
> --progress=0 is always possible anyway for the very performance
> (over-)conscious tester.
Those benefits aren't compelling enough to counterbalance the risk of
gettimeofday() overhead affecting results. (Other opinions welcome.) For a
tool like pgbench that requires considerable skill to use effectively,
changing a default only helps slightly. Even a small risk is enough to make
leaving the default unchanged the better choice.
>> Patch (2): Make --initialize mode respect --progress.
>>
>> The definition you've chosen for --quiet makes that option contrary to its own
>> name: message-per-100k-tuples is typically more voluminous than your proposed
>> new default of message-per-5s. In fact, since --quiet currently switches from
>> message-per-100k-tuples to message-per-5s, your patch gives it the exact
>> opposite of its present effect.
>>
>> During the 9.3 development cycle, we _twice_ made changes pertaining to
>> --initialize message frequency:
>>
>> http://www.postgresql.org/message-id/flat/20120620(dot)170427(dot)347012755716399568(dot)t-ishii(at)sraoss(dot)co(dot)jp
>> http://www.postgresql.org/message-id/flat/1346472039(dot)18010(dot)10(dot)camel(at)vanquo(dot)pezone(dot)net
>>
>> That gives me pause about working through yet another change; we keep burning
>> time on this minor issue.
>
> I totally agree that this "quiet" is not convincing!
>
> My actual opinion is that quiet should just mean quiet:-), i.e. no
> progress report.
>
> I tried to preserve the row-counting behavior because I thought that
> someone would object otherwise, but I would be really in favor of
> dropping the row-counting report behavior altogether and keeping only the
> time-triggered report.
I would be fine with dropping the row-counting behavior. But why subject this
distant side issue to its third round of bikeshedding in 18 months?
>> Patch (4): Redefine "latency" as reported by pgbench and report "lag" more.
>>
>> If we do this, should we also augment the --log output to contain the figures
>> necessary to reconstruct this calculation? I think that would mean adding
>> fields for the time when the first command of the transaction was sent.
>
> Yep, I thought about it, but decided against it for the following reason:
> the lag measure is only significant under throttling, which is not the
> typical use case for pgbench. When there is no throttling, there is no
> lag, so nothing to report. Also, having two formats would mean that
> existing tools that analyse these logs would have to deal with both.
Fair enough.
>>> - Take thread start time at the beginning of the thread.
>>>
>>> Otherwise it includes pretty slow thread/fork system start times in
>>> the measurements. May help with bug #8326.
>>
>> Patch (5)
>>
>> That theory is sound, but I would like at least one report reproducing that
>> behavior and finding that this change improved it.
>
> It was really annoying when I was debugging pgbench, because I did not
> know whether flukes I was seeing were due to this issue or elsewhere.
> Having the start time taken when the test really starts helps rule this out
> as a potential source of strange figures on short runs.
>
> The bug report I mentioned was a pretty short test with a *lot* of
> threads which took possibly several seconds to be started, but I cannot
> run that on my small laptop.
I can reproduce the report on low-spec machines, including a 1-GiB VM with "-c
100 -j 100" instead of "-c 800 -j 100". Your patch does not dramatically
change the spread between "including connections establishing" and "excluding
connections establishing". Thinking about it, that's unsurprising. The
thrust of this change is to not count client thread creation at all, as
opposed to counting it as part of connection establishment. Establishing
database connections is more expensive than creating threads, so it will
remain the dominant cost. Whether thread creation should be accounted
therewith seems arbitrary, so I'm inclined to leave it alone.
I'll post some further detail to the pgsql-bugs thread.
>>> - Updated documentation & help message.
>>
>> Include any relevant documentation and --help updates with the patch
>> necessitating them. If there are errors in the documentation today, put fixes
>> for those in a separate patch (7).
>
> I could do a separate patch for a one-liner doc update for homogeneity...
That's how we often do things. Thanks.
>> Additionally, this patch contains numerous whitespace-only changes.
>
> Hmmm. This is emacs' fault:-) I'll try to remove them.
Patch credit: "Emacs, with some tweaks by Fabien".
>>> Sample output under benchmarking with --progress=1
>>
>>> progress: 36.0 s, 10.9 tps, 91.864 +- 124.874 ms lat
>>> progress: 37.0 s, 115.2 tps, 8.678 +- 1.792 ms lat
>>
>> "x +- y" does not convey that y is a standard deviation.
>
> I'm referring to the ± sign, but in ASCII because I doubt that I can use a
> UTF-8 character:-) From http://en.wikipedia.org/wiki/Plus-minus_sign:
>
> "In experimental sciences, the sign commonly indicates the confidence
> interval or error in a measurement, often the standard deviation or
> standard error."
>
> ISTM that "pgbench" is experimental science.
Use of +-y as an abbreviation for σ=y seems defensible when the distribution
has little skew. The "91.864 +- 124.874" quoted above need not imply negative
latency measurements, yet a plain reading of +- would imply just that. (I
rarely read work in experimental science fields, so this objection could be
out of touch with widespread practice.)
>> I suggest getting the term "stddev" in there somehow, maybe like this:
>>
>> progress: 37.0 s, 115.2 tps, latency avg=8.678 ms stddev=1.792
>
> My issue is to try to keep the line width under control so as to avoid
> line breaks in the terminal. Under throttling, there is also the time lag
> appended to the line.
>
> Moreover, using "xxx=figure" breaks simple "cut" pipelining to extract
> the figures, so I would prefer to stick to spaces.
>
> Maybe:
>
> progress: 36.0 s, 115.2 tps, lat avg 9.678 ms stddev 1.792, lag 0.143 ms
Works for me.
>>> - int64 wait = (int64)
>>> - throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
>>> + int64 wait = (int64) (throttle_delay *
>>> + 1.00055271703 * -log(getrand(thread, 1, 10000)/10000.0));
>>
>> This quantizes the random number, then dequantizes it, then corrects for the
>> resolution lost in the round trip.
>
> Yep!
>
>> Shouldn't it just be:
>>
>> int64 wait = (int64) (throttle_delay *
>> -log(1 - pg_erand48(thread->random_state)));
>
> It could. However, with this approach the inserted delay would be pretty
> much unbounded. The benefit of using 10000 is that the delay is bounded
> by about 10 times the expected average delay, so although it is throttled
> by a stochastic process, it stays quite active and steady. Basically,
> using the quantization seemed a good compromise to me...
Ah; that makes sense. Then I understand why you want to remove the bias, but
why do you also increase the upper bound?
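For reference, my reading of the arithmetic, as a standalone sketch rather
than the patch's code: with getrand(1, k)/k the sampled uniform never drops
below 1/k, so the delay is capped at log(k) times the average -- about 6.9x
for k = 1000 and 9.2x for k = 10000 -- and the quantization also shrinks the
mean slightly, which is where the 1.00055271703 factor appears to come from:

    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
        int         k = 10000;      /* quantization steps, per the patch */
        double      mean = 0.0;
        int         i;

        /* mean of -log(i/k) over i = 1..k, i.e. the quantized exponential */
        for (i = 1; i <= k; i++)
            mean += -log((double) i / k) / k;

        printf("upper bound: %.2f x the average delay\n", log((double) k));
        printf("mean of quantized sample: %.9f\n", mean);
        printf("bias correction: %.11f\n", 1.0 / mean);   /* ~1.00055271703 */
        return 0;
    }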
>>> - if ((!use_quiet) && (j % 100000 == 0))
>>> + if (!progress && (j % 100000 == 0))
>>
>> Project style is "progress == 0" rather than "!progress".
>
> I could. I used "!progress" because "progress == 0" is interpreted as "no
> progress", and not "progress with value zero", which would mean a report
> at every possible time.
That's a good argument for doing it the way you did it.
>>> @@ -2143,8 +2174,15 @@ printResults(int ttype, int normal_xacts, int nclients,
>>> [...]
>> This output only happens when using --rate. That has made sense
>> heretofore, because in the absence of rate limiting, average latency
>> can be trivially calculated from tps. Since we will now track stddev as
>> well, let's print something like this even when --rate is not in use.
>
> It is also printed without --rate. There is an "if" above because there is
> one report with "lag" (under --rate), and one without.
The code I quoted is for the final report in printResults(), and that only
shows latency mean/stddev when using --rate. The progress reporting in
threadRun() does have two variations as you describe.
>>> - float tps, total_run, latency;
>>> + double tps, total_run, latency, sqlat, stdev, lag;
>>
>> Why these variable type changes?
>
> Because C computes in double anyway (all math functions such as the sqrt
> used above operate on double), and it was an oversight on my part to have
> used float in the first place in the previous patch. Thus I fixed it.
Okay.
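For anyone following along, a standalone sketch (not the patch's code) of the
kind of computation those variable names suggest: accumulate latency and
squared latency in double, then derive the standard deviation at report time:

    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
        double      sum = 0.0;      /* running sum of latencies, in ms */
        double      sqsum = 0.0;    /* running sum of squared latencies */
        double      samples[] = {8.1, 9.0, 7.6, 10.4, 8.3};
        int         n = sizeof(samples) / sizeof(samples[0]);
        int         i;
        double      avg, stdev;

        for (i = 0; i < n; i++)
        {
            sum += samples[i];
            sqsum += samples[i] * samples[i];
        }

        avg = sum / n;
        stdev = sqrt(sqsum / n - avg * avg);    /* sqrt(E[x^2] - E[x]^2) */
        printf("lat avg %.3f ms stddev %.3f\n", avg, stdev);
        return 0;
    }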
--
Noah Misch
EnterpriseDB http://www.enterprisedb.com