Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement)

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement)
Date: 2013-06-14 14:09:54
Message-ID: 51BB2432.6040102@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 6/12/13 3:19 AM, Fabien COELHO wrote:
> If you are still worried: if you run the very same command without
> throttling and measure the same latency, does the same thing happens at
> the end? My guess is that it should be "yes". If it is no, I'll try out
> pgbench-tools.

It looks like it happens rarely for one client without the rate limit,
but that increases to every time for multiple client with limiting in
place. pgbench-tools just graphs the output from the latency log.
Here's a setup that runs the test I'm doing:

$ createdb pgbench
$ pgbench -i -s 10 pgbench
$ pgbench -S -c 25 -T 30 -l pgbench && tail -n 40 pgbench_log*

Sometimes there's no slow entries. but I've seen this once so far:

0 21822 1801 0 1371217462 945264
1 21483 1796 0 1371217462 945300
8 20891 1931 0 1371217462 945335
14 20520 2084 0 1371217462 945374
15 20517 1991 0 1371217462 945410
16 20393 1928 0 1371217462 945444
17 20183 2000 0 1371217462 945479
18 20277 2209 0 1371217462 945514
23 20316 2114 0 1371217462 945549
22 20267 250128 0 1371217463 193656

The third column is the latency for that transaction. Notice how it's a
steady ~2000 us except for the very last transaction, which takes
250,128 us. That's the weird thing; these short SELECT statements
should never take that long. It suggests there's something weird
happening with how the client exits, probably that its latency number is
being computed after more work than it should.

Here's what a rate limited run looks like for me. Note that I'm still
using the version I re-submitted since that's where I ran into this
issue, I haven't merged your changes to split the rate among each client
here--which means this is 400 TPS per client == 10000 TPS total:

$ pgbench -S -c 25 -T 30 -R 400 -l pgbench && tail -n 40 pgbench_log

7 12049 2070 0 1371217859 195994
22 12064 2228 0 1371217859 196115
18 11957 1570 0 1371217859 196243
23 12130 989 0 1371217859 196374
8 11922 1598 0 1371217859 196646
11 12229 4833 0 1371217859 196702
21 11981 1943 0 1371217859 196754
20 11930 1026 0 1371217859 196799
14 11990 13119 0 1371217859 208014
^^^ fast section
vvv delayed section
1 11982 91926 0 1371217859 287862
2 12033 116601 0 1371217859 308644
6 12195 115957 0 1371217859 308735
17 12130 114375 0 1371217859 308776
0 12026 115507 0 1371217859 308822
3 11948 118228 0 1371217859 308859
4 12061 113484 0 1371217859 308897
5 12110 113586 0 1371217859 308933
9 12032 117744 0 1371217859 308969
10 12045 114626 0 1371217859 308989
12 11953 113372 0 1371217859 309030
13 11883 114405 0 1371217859 309066
15 12018 116069 0 1371217859 309101
16 11890 115727 0 1371217859 309137
19 12140 114006 0 1371217859 309177
24 11884 115782 0 1371217859 309212

There's almost 90,000 usec of latency showing up between epoch time
1371217859.208014 and 1371217859.287862 here. What's weird about it is
that the longer the test runs, the larger the gap is. If collecting the
latency data itself caused the problem, that would make sense, so maybe
this is related to flushing that out to disk.

If you want to look just at the latency numbers without the other
columns in the way you can use:

cat pgbench_log.* | awk {'print $3'}

That is how I was evaluating the smoothness of the rate limit, by
graphing those latency values. pgbench-tools takes those and a derived
TPS/s number and plots them, which made it easier for me to spot this
weirdness. But I've already moved onto analyzing the raw latency data
instead, I can see the issue without the graph once I've duplicated the
conditions.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-06-14 14:10:14 Re: Patch for fail-back without fresh backup
Previous Message Heikki Linnakangas 2013-06-14 14:05:34 Re: Patch for fail-back without fresh backup