COPY performance on Windows

From: "Ryohei Takahashi (Fujitsu)" <r(dot)takahashi_2(at)fujitsu(dot)com>
To: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: COPY performance on Windows
Date: 2024-11-05 02:34:20
Message-ID: TY3PR01MB118915F007BE81B054ACB324682522@TY3PR01MB11891.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi,

I noticed that COPY performance on Windows is worse in PG 17.0 than in PG 16.4.

* Environment
OS: Windows Server 2022
CPU: 22 cores x 2 CPUs
Memory: 512GB
Storage: 700GB HDD

* Input data
10GB csv file

* Executed command
psql -c 'copy table from '\''C:\data.csv'\'' WITH csv'
(Only one psql command)

* Performance
PG 16.4: 405.2s
PG 17.0: 417.4s

* Analysis
I found that commit 82a4edabd2 affects the performance.

The logic of mdzeroextend() is as follows.

if (numblocks > 8)
{
    ...
    ret = FileFallocate(); // if HAVE_POSIX_FALLOCATE, call fallocate(), else pwrite()
    ...
}
else
{
    ...
    ret = FileZero(); // call pwrite()
    ...
}

On XFS, switching back and forth between fallocate() and pwritev() reduces performance.
So, 82a4edabd2 increases numblocks so that fallocate() keeps being called once it has been called.

On the other hand, Windows does not have fallocate().
So, pwrite() is always called regardless of numblocks.
As a result, on Windows 82a4edabd2 only increases the number of blocks to be written.

* Improvement
I think 82a4edabd2 is only effective on systems that have HAVE_POSIX_FALLOCATE.
So, I made the attached patch.

With the attached patch applied to PG 17.0, the COPY takes 401.5s.

What do you think about this?

Regards,
Ryohei Takahashi

Attachment Content-Type Size
001-skip-increasing-numblocks-without-fallocate-system.patch application/octet-stream 828 bytes
