From: | Peter Geoghegan <peter(at)2ndquadrant(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Draft release notes complete |
Date: | 2012-05-14 19:21:37 |
Message-ID: | CAEYLb_VWwLYF_j0y+KoLPCaN_tMYwhryU=abD6CgKEvXujbr-w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 14 May 2012 17:06, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> So this group commit happens
> even if users don't change these?
>
> #commit_delay = 0 # range 0-100000, in microseconds
> #commit_siblings = 5 # range 1-1000
Yes, that's right - the new group commit is not configurable, and is
used all the time. I believe that very few people ever change these
other settings in production, because it is very risky to tweak them.
Read the description of them in Greg Smith's book for a good example
of what I mean, or read the docs - "it is rare that adding delay by
increasing this parameter will actually improve performance". There
may actually be no one that's set commit_delay > 0 at all. All these
settings do is insert a sleep of commit_delay microseconds, in the
hope that the call to XLogFlush() will then find that doing work is
unnecessary due to some other session having got there first.
> So the new release item wording will be:
>
> Add group commit capability for sessions that commit at the same
> time
I'd say that's almost what I'd like to see, because commit_delay falls
far short of the definition of group commit established by other
RDBMSs, which is, I assume, why the term was never used for advocacy
purposes. The old facility could result in batching of commits
specifically by artificially adding latency so that after the delay
the sessions found they could fastpath. While it worked under
artificial conditions, I think it resulted in relatively small
improvements next to new group commit's order-of-magnitude increase in
throughput that was demonstrated for a maximally commit-bound
workload.
The mere ability to notice that an XLogFlush() call is unnecessary and
fastpath out could be argued to be an aboriginal group commit,
predating even commit_delay, as could skipping duplicate fsync()
requests in XLogWrite(), which I think Jeff pointed out, but I don't
think anyone actually takes this position.
What I'd suggest is something like:
"""
Add group commit capability so that when a session commits, other,
concurrent sessions with pending commits will later automatically
check if their commit has been satisfied by the original request (or
even some other, intervening request) before actually proceeding.
This results in a large increase in transaction throughput for
commit-bound workloads, as under these circumstances the actual number
of flush requests will be considerably lower than that of earlier
versions of PostgreSQL.
"""
Now, granted, the number of actual flush requests being so high wasn't
a problem because of any actual number of follow-through duplicate
fsync() system calls (or writes) - XLogWrite() notices them, and
probably has forever (although I read suggestion that some MySQL
flavours might have actually been stupid enough to do so pre group
commit). But in order to be able to notice this, backends previously
had to get the WALWriteLock exclusively. I don't think that these are
the kind of qualifications that belong here though. People rightfully
only care about the practical implications, and whether or not we're
competitive in various respects, such as whether or not we tick the
group commit box, and whether or not we have pretty graphs that are
comparable to those of other database systems.
> This is the git commit message:
>
> Make group commit more effective.
>
> When a backend needs to flush the WAL, and someone else is already flushing
> the WAL, wait until it releases the WALInsertLock and check if we still need
> to do the flush or if the other backend already did the work for us, before
> acquiring WALInsertLock. This helps group commit, because when the WAL flush
> finishes, all the backends that were waiting for it can be woken up in one
> go, and the can all concurrently observe that they're done, rather than
> waking them up one by one in a cascading fashion.
>
> This is based on a new LWLock function, LWLockWaitUntilFree(), which has
> peculiar semantics. If the lock is immediately free, it grabs the lock and
> returns true. If it's not free, it waits until it is released, but then
> returns false without grabbing the lock. This is used in XLogFlush(), so
> that when the lock is acquired, the backend flushes the WAL, but if it's
> not, the backend first checks the current flush location before retrying.
>
> Original patch and benchmarking by Peter Geoghegan and Simon Riggs, although
> this patch as committed ended up being very different from that.
>
> (Heikki Linnakangas)
>
> Is that commit message inaccurate?
I wouldn't say that it's inaccurate. However, if we were to deprecate
commit_delay and commit_siblings, that would be a mere matter of
removing this code (and the GUC stuff itself):
if (CommitDelay > 0 && enableFsync &&
MinimumActiveBackends(CommitSiblings))
pg_usleep(CommitDelay);
This code is just before one call (of several) to XLogFlush(). The
guts of the new group commit are in that function itself. While my
call to deprecate commit_delay + commit_siblings in another thread may
have been premature (more on that later), this is basically orthogonal
code. That said, most of the possible use-cases for "old" group commit
may overlap with that of "new". So in that very limited sense only it
is an improvement on the old group commit. Again, I don't think that
that's the kind of subtle distinction that needs to be made here. This
is a documented intended for the widest possible audience.
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2012-05-14 19:29:19 | Re: Draft release notes complete |
Previous Message | Tom Lane | 2012-05-14 19:11:44 | Re: Re: [COMMITTERS] pgsql: Ensure age() returns a stable value rather than the latest value |