From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: pgbench - merging transaction logs
Date: 2015-03-28 10:21:06
Message-ID: alpine.DEB.2.10.1503281016080.17117@sto
Lists: pgsql-hackers
Hello Tomas,
> I do agree that fprintf is not cheap, actually when profiling pgbench
> it's often the #1 item, but the impact on the measurements is actually
> quite small. For example with a small database (scale 10) and read-only
> 30-second runs (single client), I get this:
>
> no logging: 18672 18792 18667 18518 18613 18547
> with logging: 18170 18093 18162 18273 18307 18234
>
> So on average, that's 18634 vs. 18206, i.e. less than 2.5% difference.
> And with more expensive transactions (larger scale, writes, ...) the
> difference will be much smaller.
I did some testing with a scale-10 prepared "SELECT only" benchmark,
200-second runs, plenty of them, with local socket connections on the
largest host I have available:
pgbench -P 10 -T 200 -S -M prepared -j $c -c $c ...
I think that this CPU-bound benchmark is close to a worst possible case
for the detailed per-transaction log.
I also implemented a quick-and-dirty version of a merged log based on
sharing a file handle (append mode + sprintf + fputs).
The results are as follows:
* 1 thread, 33 runs, median tps (average is consistent):
- no logging: 22062
- separate logging: 19360 (-12.2%)
- merged logging: 19326 (-12.4%, not significant from previous)
Note that the impact of logging is much larger than with your tests.
The underlying fprintf comes from GNU libc 2.19.
The worst overhead I could trigger is with 12 threads:
* 12 threads, 35 runs, median tps (average is consistent):
- no logging: 155917
- separate logging: 124695 (-20.0%)
- merged logging: 119349 (-23.5%)
My conclusion from these figures is that although the direct merged
logging approach adds some overhead, that overhead is small compared to
the cost of detailed logging itself (it adds 3.5% on top of a 20% logging
overhead with 12 threads). Other tests, even with more threads, did not
yield larger absolute or relative overheads. So while the direct merge
approach does add overhead, it is a small addition to an already bad
situation, which suggests that using a detailed log on a CPU-bound
pgbench run is a bad idea to begin with.
For a more realistic test, I ran "simple updates" which involve actual
disk writes. It ran at around 840 tps with 24 threads. The logging
overhead seems to be under 1%, and there is no significant difference
between separate and merged on the 20 runs.
So my overall conclusion is:
(1) The simple thread-shared file approach would spare pgbench heavy
merge-sort post-processing code, at a reasonable cost.
(2) The feature would not be available for the thread-emulation with this
approach, but I do not see this as a particular issue as I think that it
is pretty much only dead code and a maintenance burden.
(3) Optimizing doLog from its current fprintf-based implementation may be
a good thing.
--
Fabien.