From: | "Ryohei Takahashi (Fujitsu)" <r(dot)takahashi_2(at)fujitsu(dot)com> |
---|---|
To: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | COPY performance on Windows |
Date: | 2024-11-05 02:34:20 |
Message-ID: | TY3PR01MB118915F007BE81B054ACB324682522@TY3PR01MB11891.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Hi,
I noticed that COPY performance on Windows is worse in PG 17.0 than in PG 16.4.
* Environment
OS: Windows Server 2022
CPU: 22 cores * 2 CPUs
Memory: 512GB
Storage: 700GB HDD
* Input data
10GB csv file
* Executed command
psql -c 'copy table from '\''C:\data.csv'\'' WITH csv'
(Only one psql command)
* Performance
PG 16.4: 405.2s
PG 17.0: 417.4s
* Analysis
I found that commit 82a4edabd2 affects the performance.
The logic of mdzeroextend() is as follows.
if (numblocks > 8)
{
    ...
    ret = FileFallocate(); // if HAVE_POSIX_FALLOCATE, call fallocate(), else pwrite()
    ...
}
else
{
    ...
    ret = FileZero(); // call pwrite()
    ...
}
On XFS filesystems, switching between fallocate() and pwritev() reduces performance.
So, 82a4edabd2 increased numblocks so that fallocate() continues to be called once it has been called.
On the other hand, Windows does not have fallocate().
So, pwrite() is always called regardless of numblocks.
As a result, on Windows 82a4edabd2 only increases the number of blocks written at once.
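The branch described above can be modelled in a few lines of C. This is a minimal sketch, not PostgreSQL source: `ExtendMethod` and `choose_extend_method` are hypothetical names, and the compile-time HAVE_POSIX_FALLOCATE check is replaced by a runtime flag so that both cases can be exercised in one build.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of which system call ends up extending the file. */
typedef enum
{
    EXTEND_FALLOCATE,   /* space is reserved with fallocate() */
    EXTEND_PWRITE       /* zero-filled blocks are written with pwrite() */
} ExtendMethod;

/*
 * have_posix_fallocate stands in for the HAVE_POSIX_FALLOCATE macro
 * (an assumption made only so the sketch is testable at runtime).
 */
ExtendMethod
choose_extend_method(int numblocks, bool have_posix_fallocate)
{
    assert(numblocks > 0);

    if (numblocks > 8)
    {
        /* FileFallocate() path: falls back to writing zeros without fallocate() */
        return have_posix_fallocate ? EXTEND_FALLOCATE : EXTEND_PWRITE;
    }

    /* FileZero() path: always writes zeros */
    return EXTEND_PWRITE;
}
```

With the flag set to false, as on Windows, every value of numblocks yields EXTEND_PWRITE, so a larger numblocks changes only the size of the write.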
* Improvement
I think 82a4edabd2 is only effective on systems that have HAVE_POSIX_FALLOCATE.
So, I made the attached patch.
With the attached patch applied to PG 17.0, the COPY takes 401.5s.
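The idea of skipping the numblocks increase on systems without fallocate() might look roughly like the following. This is only an illustrative sketch and not the attached patch, whose contents are not shown here: `pages_to_extend`, `prior_extend_pages`, and the "at least as many pages as before" rule are all assumptions, and the runtime flag again stands in for HAVE_POSIX_FALLOCATE.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide how many pages to request in one extension.
 * prior_extend_pages is an assumed record of how far earlier extensions went.
 */
int
pages_to_extend(int requested, int prior_extend_pages, bool have_posix_fallocate)
{
    assert(requested > 0);
    assert(prior_extend_pages >= 0);

    /* Without fallocate(), a larger request only means a larger zero-fill write. */
    if (!have_posix_fallocate)
        return requested;

    /* With fallocate(), keep extending at least as aggressively as before. */
    return (prior_extend_pages > requested) ? prior_extend_pages : requested;
}
```

For example, `pages_to_extend(1, 64, false)` keeps the request at 1 page, while the same call with `true` raises it to 64.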
What do you think about this?
Regards,
Ryohei Takahashi
Attachment | Content-Type | Size |
---|---|---|
001-skip-increasing-numblocks-without-fallocate-system.patch | application/octet-stream | 828 bytes |