From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pgbench vs. wait events |
Date: | 2016-10-08 18:01:01 |
Message-ID: | CAMkU=1zevoSVzaEtU7pjpk=tLg2bkM6v9imKRFrRaYwLK0-D=A@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Oct 7, 2016 at 1:28 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Oct 7, 2016 at 11:51 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> > What happens if you turn fsync off? Once an xlog file is fully written, it
> > is immediately fsynced, even if the backend is holding WALWriteLock or
> > wal_insert (or both) at the time, and even if synchronous_commit is off.
> > Assuming this machine has a BBU so that it doesn't have to wait for disk
> > rotation, fsyncs are still expensive because the kernel has to find all the
> > data and get it sent over to the BBU, while holding locks.
>
> Scale factor 300, 32 clients, fsync=off:
>
> 5 Lock | tuple
> 18 LWLockTranche | lock_manager
> 24 LWLockNamed | WALWriteLock
> 88 LWLockTranche | buffer_content
> 265 LWLockTranche | wal_insert
> 373 LWLockNamed | CLogControlLock
> 496 LWLockNamed | ProcArrayLock
> 532 Lock | extend
> 540 LWLockNamed | XidGenLock
> 545 Lock | transactionid
> 27067 Client | ClientRead
> 85364 |
>
Did the TPS go up appreciably?
>
> But I'm not sure you're right about the way the fsync=off code works.
> I think pg_fsync()/pg_fsync_writethrough()/pg_fsync_no_writethrough()
> look at enableFsync and just do nothing if it's false.
>
I think we are in agreement. I don't know which part you think is wrong.
When I said immediately, I didn't mean unconditionally.
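
To make "immediately, but not unconditionally" concrete, here is a minimal
sketch. It is not the actual fd.c or xlog.c code: every name in it
(enable_fsync_model, device_sync, guarded_fsync, finish_segment) is made up
for illustration, and the real device sync is replaced by a counter so the
behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fsync setting; not the real variable. */
bool enable_fsync_model = true;

/* Counts how many times the (simulated) device sync was really issued. */
int real_syncs = 0;

/* Simulated expensive sync; a real one would call fsync(fd). */
int device_sync(int fd)
{
    (void) fd;
    real_syncs++;
    return 0;
}

/* The guard: with the setting off, the call returns success and does nothing. */
int guarded_fsync(int fd)
{
    if (!enable_fsync_model)
        return 0;
    return device_sync(fd);
}

/*
 * Called as soon as a segment is fully written. The sync request is issued
 * immediately (no deferral), but whether any work happens depends on the
 * guard above.
 */
int finish_segment(int fd)
{
    return guarded_fsync(fd);
}
```

With the setting on, every finished segment costs one sync in the caller's
context; with it off, the call site is still reached but costs nothing.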

Anyway, based on the reduced wait events, I think this shows that if we
need to do something in the xlog area, it would probably be to add a queue
of fully written but un-synced xlog files, so that the syncing can be
delegated to the background wal writer process. And of course anyone
needing to actually flush their xlog would have to start by flushing the
queue.

(Or perhaps just make the xlog files bigger, and call it a day.)
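
A rough sketch of the queue idea, under loud assumptions: all names here
(UnsyncedQueue, queue_segment_written, queue_drain, flush_for_commit) are
hypothetical, the sync itself is simulated with a counter, and locking and
shared memory are left out entirely. It only shows the ordering rule: a
backend that finishes a segment enqueues it instead of syncing, and anyone
who needs a real flush drains the queue first.

```c
#include <assert.h>
#include <stddef.h>

#define UNSYNCED_QUEUE_SIZE 8

/* Ring buffer of fully written but not yet synced segment descriptors. */
typedef struct
{
    int    fds[UNSYNCED_QUEUE_SIZE];
    size_t head;        /* index of the oldest queued segment */
    size_t count;       /* number of queued segments */
    int    syncs_done;  /* simulated sync counter, for illustration only */
} UnsyncedQueue;

/* Simulated sync of one segment; a real one would fsync the descriptor. */
int sync_segment(UnsyncedQueue *q, int fd)
{
    (void) fd;
    q->syncs_done++;
    return 0;
}

/* Sync queued segments oldest first; returns how many were synced. */
int queue_drain(UnsyncedQueue *q)
{
    int n = 0;

    while (q->count > 0)
    {
        sync_segment(q, q->fds[q->head]);
        q->head = (q->head + 1) % UNSYNCED_QUEUE_SIZE;
        q->count--;
        n++;
    }
    return n;
}

/*
 * A backend finished writing a segment: hand it to the queue rather than
 * syncing while holding locks. If the queue is full, fall back to draining
 * it in the foreground.
 */
void queue_segment_written(UnsyncedQueue *q, int fd)
{
    if (q->count == UNSYNCED_QUEUE_SIZE)
        queue_drain(q);
    q->fds[(q->head + q->count) % UNSYNCED_QUEUE_SIZE] = fd;
    q->count++;
}

/* A real flush must sync all older segments before the current one. */
void flush_for_commit(UnsyncedQueue *q, int current_fd)
{
    queue_drain(q);
    sync_segment(q, current_fd);
}
```

In this sketch the background process would simply call queue_drain()
periodically, so a foreground flush usually finds the queue empty.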
Cheers,
Jeff