Re: Long paths for tablespace leads to uninterruptible hang in Windows

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Long paths for tablespace leads to uninterruptible hang in Windows
Date: 2013-10-14 15:26:44
Message-ID: CAA4eK1KypjMoEg8ynrVS6_=EAkrAdzWiVLiFd0N4QEzH_tsbbw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 14, 2013 at 8:40 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Mon, Oct 14, 2013 at 2:28 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Oct 10, 2013 at 9:34 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> On further analysis, I found that the hang occurs in some Windows
>>> APIs (FindFirstFile, RemoveDirectory) when the symlink path
>>> (pg_tblspc/spcoid/TABLESPACE_VERSION_DIRECTORY) is passed to these
>>> APIs. For the above test case, it hangs in the path
>>> destroy_tablespace_directories->ReadDir->readdir->FindFirstFile
>>
>> Well, that sucks. So it's a Windows bug.
>>
>>> Some of the ways to resolve the problem are described as below:
>>>
>>> 1. I found that if the link path is accessed as a full path during
>>> readdir or stat, it works fine.
>>>
>>> For example in function destroy_tablespace_directories(), the path
>>> used to access tablespace directory is of form
>>> "pg_tblspc/16235/PG_9.4_201309051" by using below sprintf
>>> sprintf(linkloc_with_version_dir,
>>> "pg_tblspc/%u/%s",tablespaceoid,TABLESPACE_VERSION_DIRECTORY);
>>> Now when the code accesses this path, it assumes that the
>>> corresponding OS API will resolve it relative to the current
>>> working directory, which is right as per the specs. However, the
>>> OS API (FindFirstFile) hangs if the path length is > 130 for a
>>> symlink, whereas using the full path instead of one starting with
>>> pg_tblspc works fine.
>>> So one way to resolve this issue is to use the full path for symbolic
>>> link access instead of relying on the OS to resolve it.
>>
>> I'm not sure how we'd implement this, except by doing #2.
>
> If we believe it's a Windows bug, perhaps a good start would be to
> report it to Microsoft?

I tried posting it on the Windows forums, but have not received any
answer from them so far. The links where I posted it are below:
http://answers.microsoft.com/en-us/windows/forum/windows_7-performance/stat-hangs-on-windows-7-when-used-for-symbolic/f7c4573e-be28-4bbf-ac9f-de990a3f5564
http://social.technet.microsoft.com/Forums/windows/en-US/73af1516-baaf-4d3d-914c-9b22c465e527/stat-hangs-on-windows-7-when-used-for-symbolic-link?forum=TechnetSandboxForum

> There might be an "official workaround" for
> it, or in fact, there might already exist a fix for it..

The only workaround I could find is to use an absolute path. One way
to fix it is to call make_absolute_path() on the path in functions
like pgwin32_safestat() before using it.
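
To make the idea concrete, here is a minimal portable sketch, not a
patch. The names to_absolute_path() and safestat_absolute() are
hypothetical stand-ins for make_absolute_path() and pgwin32_safestat(),
and the sketch uses POSIX getcwd()/stat() rather than the Windows APIs,
so it only shows the shape of the change:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Hypothetical stand-in for make_absolute_path(): return a malloc'd
 * absolute version of "path", or NULL on failure.  A relative path is
 * prefixed with the current working directory.
 */
char *
to_absolute_path(const char *path)
{
    char    cwd[PATH_MAX];
    char   *result;
    size_t  len;

    assert(path != NULL);

    /* Already absolute: just hand back a copy. */
    if (path[0] == '/')
        return strdup(path);

    if (getcwd(cwd, sizeof(cwd)) == NULL)
        return NULL;

    len = strlen(cwd) + 1 + strlen(path) + 1;
    result = malloc(len);
    if (result == NULL)
        return NULL;

    snprintf(result, len, "%s/%s", cwd, path);
    return result;
}

/*
 * Hypothetical stat wrapper: convert the path to absolute form first,
 * so the OS never sees a relative symlink path.
 */
int
safestat_absolute(const char *path, struct stat *buf)
{
    char   *abspath = to_absolute_path(path);
    int     rc;
    int     save_errno;

    if (abspath == NULL)
        return -1;

    rc = stat(abspath, buf);
    save_errno = errno;     /* keep stat()'s errno across free() */
    free(abspath);
    errno = save_errno;
    return rc;
}
```

With this shape, callers keep passing "pg_tblspc/..." and only the
wrapper changes.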

The other way to fix it is to change every place in the code that uses
a "pg_tblspc/" path to use an absolute path, but it is used in quite a
few places, and changing them all might make the code dirty.
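
For comparison, a rough sketch of what each such call site would need.
The helper tablespace_version_path() is hypothetical, and it assumes
the caller has an absolute data directory path at hand to pass as
"datadir":

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<datadir>/pg_tblspc/<oid>/<version_dir>"
 * in dst.  Returns 0 on success, -1 if the result would not fit in
 * dstlen bytes.
 */
int
tablespace_version_path(char *dst, size_t dstlen, const char *datadir,
                        unsigned int spcoid, const char *version_dir)
{
    size_t  dlen;
    int     n;

    assert(dst != NULL && datadir != NULL && version_dir != NULL);

    /* Drop trailing slashes from datadir before appending. */
    dlen = strlen(datadir);
    while (dlen > 1 && datadir[dlen - 1] == '/')
        dlen--;

    n = snprintf(dst, dstlen, "%.*s/pg_tblspc/%u/%s",
                 (int) dlen, datadir, spcoid, version_dir);

    /* snprintf reports the untruncated length; treat overflow as error. */
    if (n < 0 || (size_t) n >= dstlen)
        return -1;
    return 0;
}
```

Every existing sprintf of the "pg_tblspc/%u/%s" form would have to be
replaced with something like this, which is why this option touches so
many places.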

> We're *probably* going to have to end up deploying a workaround, but
> it would be a good idea to check first if they have a suggestion for
> how...

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
