| From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: [PATCH] 2PC state files on shared memory |
| Date: | 2009-08-08 13:31:47 |
| Message-ID: | 4A7D7E43.9000600@enterprisedb.com |
| Lists: | pgsql-hackers |
Tom Lane wrote:
> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
>> Based on an idea of Heikki Linnakangas, here is a patch in order to improve
>> 2PC by sending the state files of prepared transactions to shared memory
>> instead of disk.
>
> I don't understand how this can possibly work. The entire point of
> 2PC is that the state file is guaranteed to be on disk so it will
> survive a crash. What good is it if it's in shared memory?

The state files are not fsync'd when they're written, but a copy is
written to WAL so that it can be replayed on crash. With this patch,
it's still written to WAL, but the write to a file on disk is skipped,
and it's stored in shared memory instead.

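To make the durability argument concrete, here is a toy model in Python. It is only a sketch under stated assumptions: `ToyServer`, `wal` and `shared_mem` are hypothetical names invented for illustration, not PostgreSQL internals or code from the patch. It only shows why losing the in-memory copy is harmless as long as the WAL copy survives.

```python
# Toy model only: every name here is hypothetical, not a PostgreSQL API.

class ToyServer:
    def __init__(self):
        self.wal = []         # stands in for the durable WAL (survives a crash)
        self.shared_mem = {}  # stands in for shared memory (lost on a crash)

    def prepare(self, gid, state):
        # The state is logged first, so it can be replayed after a crash ...
        self.wal.append(("prepare", gid, state))
        # ... and then kept in memory instead of being written to a state file.
        self.shared_mem[gid] = state

    def commit_prepared(self, gid):
        # Finishing the transaction is logged too, then the state is dropped.
        self.wal.append(("commit", gid, None))
        self.shared_mem.pop(gid, None)

    def crash(self):
        # Only the volatile part is wiped; the log is untouched.
        self.shared_mem = {}

    def recover(self):
        # Replay the log in order to rebuild the set of prepared transactions.
        for kind, gid, state in self.wal:
            if kind == "prepare":
                self.shared_mem[gid] = state
            else:
                self.shared_mem.pop(gid, None)
```

After a `crash()` followed by `recover()`, the model again holds exactly the transactions that were prepared but not yet committed.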
> Quite aside from that, the fixed size of shared memory makes this seem
> pretty impractical.

Most state files are small. If one doesn't fit in the area reserved for
this, it's written to disk as usual. It's just an optimization.

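The fallback can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: `StateStore`, `area_size` and `spill_dir` are hypothetical names, the "reserved area" is modeled as a simple byte budget, and space is never reclaimed in this sketch.

```python
import os
import tempfile

# Illustrative sketch only: names and layout are assumptions, not the patch.

class StateStore:
    def __init__(self, area_size, spill_dir):
        self.area_size = area_size  # bytes reserved for in-memory state
        self.used = 0               # bytes of the area currently in use
        self.in_memory = {}         # gid -> state bytes held in the area
        self.spill_dir = spill_dir  # where oversized state goes instead

    def _path(self, gid):
        return os.path.join(self.spill_dir, gid + ".state")

    def store(self, gid, state):
        # Fast path: the state fits in what is left of the reserved area.
        if self.used + len(state) <= self.area_size:
            self.in_memory[gid] = state
            self.used += len(state)
            return "memory"
        # Slow path: no room, so write a file on disk as before.
        with open(self._path(gid), "wb") as f:
            f.write(state)
        return "disk"

    def load(self, gid):
        # Readers check the in-memory area first, then fall back to the file.
        if gid in self.in_memory:
            return self.in_memory[gid]
        with open(self._path(gid), "rb") as f:
            return f.read()
```

For example, with `StateStore(16, tempfile.mkdtemp())` an 8-byte state stays in memory while a 64-byte one is written to a file, and `load()` returns either one transparently.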
I'm a bit disappointed by the performance gains. I would've expected
more, given a decent battery-backed-up cache to buffer the WAL fsyncs.
But it looks like they're still causing the most overhead, even with a
battery-backed-up cache.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com