From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] O_DIRECT for WAL writes |
Date: | 2005-07-23 17:32:30 |
Message-ID: | 200507231732.j6NHWVu04215@candle.pha.pa.us |
Lists: | pgsql-hackers pgsql-patches |
I have modified and attached your patch for your review. I didn't see
any value in adding new fsync_method values because, to me, O_DIRECT is
basically just like O_SYNC except that it doesn't keep a copy of the
buffer in the kernel cache. If you are doing fsync(), I don't see how
O_DIRECT makes any sense, because O_DIRECT is writing to disk on every
write, so what is the fsync() actually doing? This might explain why your
fsync/direct and open/direct performance numbers are almost identical.
Basically, if you are going to use O_DIRECT, why not use open_sync?

What I did was to add O_DIRECT unconditionally for all uses of O_SYNC
and O_DSYNC, so it is automatically used in those cases. And of course,
if your operating system doesn't support O_DIRECT, it isn't used.

With your posted performance numbers, perhaps we should favor
fsync_method O_SYNC on platforms that have O_DIRECT even if we don't
support OPEN_DATASYNC, but I bet most platforms that have O_DIRECT also
have O_DSYNC. Perhaps some folks can run tests once the patch is
applied.
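
To make the flag handling concrete, here is a minimal sketch of the idea.
It is not the code in the attached patch. The names wal_sync_method,
WAL_DIRECT_FLAG and wal_open_flags are hypothetical and exist only for
illustration, and the fallback to O_SYNC when O_DSYNC is missing is my
assumption. On Linux/glibc, O_DIRECT is only visible from <fcntl.h> when
_GNU_SOURCE is defined before the first include.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <fcntl.h>

/* If the platform has no O_DIRECT, OR-ing in 0 makes it a no-op. */
#ifdef O_DIRECT
#define WAL_DIRECT_FLAG O_DIRECT
#else
#define WAL_DIRECT_FLAG 0
#endif

/* Hypothetical enum for this sketch; not the identifiers used in the patch. */
enum wal_sync_method
{
    WAL_FSYNC,
    WAL_FDATASYNC,
    WAL_OPEN_SYNC,
    WAL_OPEN_DATASYNC
};

/* Return the extra open() flags implied by the chosen sync method. */
int
wal_open_flags(enum wal_sync_method method)
{
    switch (method)
    {
        case WAL_OPEN_SYNC:
            return O_SYNC | WAL_DIRECT_FLAG;
        case WAL_OPEN_DATASYNC:
#ifdef O_DSYNC
            return O_DSYNC | WAL_DIRECT_FLAG;
#else
            /* assumption: fall back to O_SYNC where O_DSYNC is missing */
            return O_SYNC | WAL_DIRECT_FLAG;
#endif
        default:
            /* fsync()/fdatasync() methods: ordinary buffered writes */
            return 0;
    }
}
```

The result would be OR'ed into the flags passed to open() for the WAL
file, for example open(path, O_RDWR | wal_open_flags(method)). Note that
O_DIRECT usually imposes alignment requirements on the buffer, file
offset and write length, which this sketch does not address.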
---------------------------------------------------------------------------
ITAGAKI Takahiro wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> > Yeah, this is about what I was afraid of: if you're actually fsyncing
> > then you get at best one commit per disk revolution, and the negotiation
> > with the OS is down in the noise.
>
> If we disable writeback-cache and use open_sync, the per-page writing
> behavior in the WAL module will show up as poor results. O_DIRECT is
> similar to O_DSYNC (at least on Linux), so its benefit will disappear
> behind the slow disk revolution.
>
> In the current source, WAL is written as:
>     for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); }
> Is this intentional? Can we rewrite it as follows?
>     write(&buffers[0], N * BLCKSZ);
>
> To achieve this, I wrote a 'gather-write' patch (xlog.gw.diff).
> Aside from this, I'll also send the fixed direct io patch (xlog.dio.diff).
> These two patches are independent, so either or both can be applied.
>
>
> I tested them on my machine and the results are as follows. They show
> that direct-io and gather-write together are the best choice when
> writeback-cache is off. Are these two patches worth trying if they are
> used together?
>
>
>              | writeback | fsync= | fdata | open_ | fsync_ | open_
> patch        | cache     | false  | sync  | sync  | direct | direct
> -------------+-----------+--------+-------+-------+--------+-------
> direct io    | off       | 124.2  | 105.7 |  48.3 |   48.3 |   48.2
> direct io    | on        | 129.1  | 112.3 | 114.1 |  142.9 |  144.5
> gather-write | off       | 124.3  | 108.7 | 105.4 |  (N/A) |  (N/A)
> both         | off       | 131.5  | 115.5 | 114.4 |  145.4 |  145.2
>
> - 20 runs * pgbench -s 100 -c 50 -t 200
> - with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8)
> - using 2 ATA disks:
>   - hda (reiserfs) includes system and wal.
>   - hdc (jfs) includes database files. writeback-cache is always on.
>
> ---
> ITAGAKI Takahiro
> NTT Cyber Space Laboratories
>
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Attachment | Content-Type | Size |
---|---|---|
unknown_filename | text/plain | 12.6 KB |