From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>, pavel(dot)stehule(at)gmail(dot)com |
Subject: | Re: pgbench progress report improvements |
Date: | 2013-09-22 08:08:54 |
Message-ID: | alpine.DEB.2.02.1309220850080.18614@sto |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Dear Noah,
Thanks for your answers and remarks,
[...]
I'll split some part of the patch where there is no coupling, but I do
not want to submit conflicting patches.
> Those benefits aren't compelling enough to counterbalance the risk of
> gettimeofday() overhead affecting results. (Other opinions welcome.)
Yep. If I really leave gettimeofday out, I cannot measure the stddev, so
I'll end up with:
--rate => gettimeofday, mean (& stddev) measured, because they
cannot be derived otherwise.
no --rate => no (or less) gettimeofday, mean computed from total time and
no stddev report because it cannot be computed.
That is annoying, because I do want the standard deviation, and this mean
one "if"s complexity here and there.
ISTM that when running under a target time, the (hopefully very small)
overhead is only one gettimeofday call, because the other one is taken
anyway to check whether the thread should stop.
Or I can add a yet another option, say --stddev, to ask for standard
deviation, which will imply the additional gettimeofday call...
> For a tool like pgbench that requires considerable skill to use
> effectively, changing a default only helps slightly. It doesn't take
> much of a risk to make us better off leaving the default unchanged.
I can put a 0 default... but even very experienced people will be bitten
over and over. Why should I care? ISTM that the default should be the
safe option, and experienced user can request "-quiet" if they want it.
>> [...]
>> I tried to preserve the row-counting behavior because I thought that
>> someone would object otherwise, but I would be really in favor of
>> dropping the row-counting report behavior altogether and only keep the
>> time triggered report.
>
> I would be fine with dropping the row-counting behavior. But why subject this
> distant side issue to its third round of bikeshedding in 18 months?
I was not involved:-)
The 100000 behavior is the initial & old version, and only applies to
initialization. Someone found it too verbose when scaling, and I agree, so
made a quick patch which preserves the old behavior (someone must have
said: whatever, do not change the default!) but allowed to switch to a
less noisy version, hence the -quiet which is not really quiet. This would
be the defective result of a compromise:-)
If I follow your request not to change any default, I cannot merge cleanly
the -i & bench behaviors, as currenty -i does have a default progress
report and its own -quiet, and benchmarking does not.
The current behavior is inconsistent. I would prefer something consistent,
preferably always show a progress report, and -quiet hides it (fully), or
if you really insist no progress report, and --progress shows it, and the
-quiet option is removed.
>>>> - Take thread start time at the beginning of the thread.
>>>
>>> That theory is sound, but I would like at least one report reproducing that
>>> behavior and finding that this change improved it.
>
> [...] so I'm inclined to leave it alone.
I really spent *hours* debugging stupid measures on the previous round of
pgbench changes, when adding the throttling stuff. Having the start time
taken when the thread really starts is just sanity, and I needed that just
to rule out that it was the source of the "strange" measures.
-j 800 vs -j 100 : ITM that if I you create more threads, the time delay
incurred is cumulative, so the strangeness of the result should worsen.
800 threads ~ possibly 10 seconds of creation time, to be compared to a
few seconds of run time.
>>> Shouldn't it just be:
>>>
>>> int64 wait = (int64) (throttle_delay *
>>> -log(1 - pg_erand48(thread->random_state)));
>> [...]
> Ah; that makes sense. Then I understand why you want to remove the bias, but
> why do you also increase the upper bound?
Because the bias was significantly larger for 1000 (about 0.5%), so this
also contributed to reduce said bias, and 9.2 times the average target
seems as reasonnable a bound as 6.9.
>> It is also printed without --rate. There is a "if" above because there is
>> one report with "lag" (under --rate), and one without.
>
> The code I quoted is for the final report in printResults(), and that only
> shows latency mean/stddev when using --rate. The progress reporting in
> threadRun() does have two variations as you describe.
Indeed, I took it for the progress report. I'll check. It must be
consistent whether under --rate or not.
--
Fabien.
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2013-09-22 09:10:57 | Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE |
Previous Message | Amit Kapila | 2013-09-22 07:47:52 | Re: dynamic shared memory |