Re: O_DIRECT support for Windows

From: "Chuck McDevitt" <cmcdevitt(at)greenplum(dot)com>
To: "Takayuki Tsunakawa" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: O_DIRECT support for Windows
Date: 2007-01-17 07:09:29
Message-ID: EB48EBF3B239E948AC1E3F3780CF8F88018BB32B@MI8NYCMAIL02.Mi8.com
Lists: pgsql-hackers pgsql-patches

People seem to be confusing sector size and cluster size.

Microsoft Windows assumes that sectors on hard drives are 8KB or smaller
(99% are 512 bytes).

Cluster size is the allocation unit. On Windows, it can range from 512
bytes to 256KB (at most 64KB with 512-byte sectors).
NTFS (which I think we need) is limited to 64KB, last I looked.

On RAID devices, the allocation unit might actually be larger, but
the *sector* size of these devices is usually still 8KB or less (they
usually mimic the 512-byte sector size, because too much software
assumes it).

Non-buffered I/O doesn't need to be aligned on cluster boundaries, only
on sector boundaries.

And even that restriction is enforced only by certain drivers and
devices. Many don't enforce it.
But to be safe, sector alignment is best, because some drivers do care.
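
To make the sector-alignment rule concrete, here is a minimal sketch in portable C. It assumes the usual reading of the rule for non-buffered I/O: the buffer address, the file offset, and the transfer length should each be a multiple of the sector size. The function names are hypothetical and are not taken from the patch under discussion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if value is a multiple of sector_size; a zero sector size never matches. */
static bool is_multiple_of(uint64_t value, uint32_t sector_size)
{
    return sector_size != 0 && value % sector_size == 0;
}

/*
 * Hypothetical helper: check that a request is sector aligned, i.e. the
 * buffer address, the file offset, and the length are all multiples of
 * the sector size. Cluster size plays no part in the check.
 */
static bool io_is_sector_aligned(const void *buf, uint64_t offset,
                                 size_t len, uint32_t sector_size)
{
    return is_multiple_of((uint64_t) (uintptr_t) buf, sector_size)
        && is_multiple_of(offset, sector_size)
        && is_multiple_of((uint64_t) len, sector_size);
}
```

With 512-byte sectors, an 8KB write at an 8KB offset from an 8KB-aligned buffer passes this check regardless of the cluster size of the volume.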

-----Original Message-----
From: pgsql-patches-owner(at)postgresql(dot)org
[mailto:pgsql-patches-owner(at)postgresql(dot)org] On Behalf Of Takayuki
Tsunakawa
Sent: Tuesday, January 16, 2007 4:53 PM
To: Magnus Hagander
Cc: ITAGAKI Takahiro; pgsql-patches(at)postgresql(dot)org
Subject: Re: [pgsql-patches] O_DIRECT support for Windows

Hello, Magnus-san, Itagaki-san

From: "Magnus Hagander" <magnus(at)hagander(dot)net>
>> I think many people can benefit from Itagaki-san's proposal, and
>> NO_BUFFERING should be default. Isn't it very rare that disks with
>> sector size larger than 8KB are used?
>
> Definitely very rare.
>
>
>> Providing a way (such as
>> wal_sync_method) to avoid NO_BUFFERING is sufficient for people in
>> rare environments. Or, by determining the sector size with
>> GetDiskFreeSpaceEx(), we could auto-switch to not using NO_BUFFERING
>> when the sector size is larger than 8KB.
>
> I think the second one is better.

Thank you for agreeing. Then, I hope Itagaki-san's patch will be
accepted once the following changes are added to it and a performance
report is provided.

1. On Windows, O_DIRECT (and O_SYNC?) is default for WAL.
2. Auto-switch to not using O_DIRECT if the sector size is larger than
8KB when the server starts.
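
The auto-switch in item 2 could look roughly like the following sketch. It is only an illustration, not code from the patch: the function name and the WAL_BLOCK_SIZE constant are hypothetical, and bytes_per_sector is assumed to have been obtained at server start. On Windows, the bytes-per-sector value is reported by GetDiskFreeSpace() (its lpBytesPerSector output) rather than by GetDiskFreeSpaceEx(), which as far as I know reports only byte counts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the 8KB unit discussed in this thread. */
#define WAL_BLOCK_SIZE 8192u

/*
 * Hypothetical startup check: decide whether unbuffered I/O
 * (O_DIRECT / NO_BUFFERING) can be used for WAL on a volume whose
 * sector size is bytes_per_sector.
 */
static bool should_use_no_buffering(uint32_t bytes_per_sector)
{
    /* Sector size unknown: fall back to buffered I/O. */
    if (bytes_per_sector == 0)
        return false;

    /* Sectors larger than 8KB: writes could not be sector aligned. */
    if (bytes_per_sector > WAL_BLOCK_SIZE)
        return false;

    /* An 8KB write must cover a whole number of sectors. */
    return WAL_BLOCK_SIZE % bytes_per_sector == 0;
}
```

Treating an unknown sector size as "do not use NO_BUFFERING" is a conservative choice made for this sketch; the thread itself does not discuss that case.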

> A quick google shows some inconclusive results :-) But look at, for
> example:
>
> http://groups.google.se/group/microsoft.public.sqlserver.server/tree/browse_frm/thread/d3288d3b43338b47/ff5e825dd02faff4?rnum=1&hl=en&q=ntfs+sector+size&_done=%2Fgroup%2Fmicrosoft.public.sqlserver.server%2Fbrowse_frm%2Fthread%2Fd3288d3b43338b47%2Fff5e825dd02faff4%3Ftvc%3D1%26q%3Dntfs+sector+size%26hl%3Den%26#doc_4556b64132b3baa7
>
> This seems to indicate that *Windows* supports sector sizes >4K, but SQL
> Server doesn't. But again, it could be a mixup between cluster and
> sector size...

This is interesting. I've never seen a system with a sector size
larger than 4KB, either. On IBM zSeries (a mainframe that can run
Linux), DASD (direct access storage device) is usually used as the
hard disk. The sector size of DASD is 4KB. So, the current
implementation of PostgreSQL, which assumes an 8KB sector size, is
sufficient in practice.
Reporting a clear error message, as SQL Server does, is one option when
PostgreSQL encounters a device with a larger sector size than is
supported. However, as you say, auto-switching to not using
NO_BUFFERING is kinder to users.
