From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Paul van den Bogaard <Paul(dot)Vandenbogaard(at)Sun(dot)COM> |
Subject: | Re: Reworking WAL locking |
Date: | 2008-03-23 00:05:16 |
Message-ID: | 200803230005.m2N05Gq26940@momjian.us |
Lists: | pgsql-hackers |
Added to TODO:
* Improve WAL concurrency by increasing lock granularity
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00556.php
---------------------------------------------------------------------------
Simon Riggs wrote:
>
> Paul van den Bogaard (Sun) suggested to me that we could use more than
> two WAL locks to improve concurrency. I think it's possible to introduce
> such a scheme with some ease. All mods would be within xlog.c.
>
> The scheme below requires an extra LWLock per WAL buffer.
>
> Locking within XLogInsert() would look like this:
>
> Calculate length of data to be inserted.
> Calculate initial CRC
>
> LWLockAcquire(WALInsertLock, LW_EXCLUSIVE)
>
> Reserve space to write into.
> LSN = current Insert pointer
> Move pointer forward by length of data to be inserted, acquiring
> WALWriteLock if required to ensure space is available.
>
> LWLockAcquire(LSNGetWALPageLockId(LSN), LW_SHARED);
>
> Note that we don't lock every page, just the first one of the set we
> want, but we hold it until all page writes are complete.
>
> LWLockRelease(WALInsertLock);
>
> finish calculating CRC
> write xlog into reserved space
>
> LWLockRelease(LSNGetWALPageLockId(LSN));
>
> XLogWrite() will then try to get a conditional LW_EXCLUSIVE lock
> sequentially on each page it plans to write. It keeps going until it
> fails to get the lock, then writes. Callers of XLogWrite will never be
> able to pass a backend currently performing the wal buffer fill.
>
> We write a whole page at a time.
>
> Next time, we do a regular lock wait on the same page, so that we always
> get a page eventually.
>
> This requires us to get 2 locks for an XLogInsert rather than just one.
> However the second lock is always acquired with zero-wait time when the
> wal_buffers are sensibly sized. Overall this should reduce wait time for
> the WALInsertLock since it seems likely that each actual filling of WAL
> buffers will affect different cache lines and so is very likely to be able
> to be performed in parallel.
>
> Sounds good to me.
>
> Any objections/comments before this can be tried out?
>
> --
> Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com
>
>
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +