From: | "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> |
---|---|
To: | <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: CopyReadLineText optimization |
Date: | 2008-02-29 18:24:52 |
Message-ID: | 47C84DF4.30801@enterprisedb.com |
Lists: | pgsql-hackers pgsql-patches |
Heikki Linnakangas wrote:
> Attached is a patch that modifies CopyReadLineText so that it uses
> memchr to speed up the scan. The nice thing about memchr is that we can
> take advantage of any clever optimizations that might be in libc or
> compiler.
Here's an updated version of the patch. The principle is the same, but
the same optimization is now used for CSV input as well, and there are
more comments.
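For readers without the attachment at hand, here is a minimal sketch of the general idea, not the code from the patch. The helper name find_line_end is hypothetical, and it assumes only the text-format rule that a backslash escapes the byte following it. It lets memchr find the next candidate newline, then checks only the bytes before that newline for a backslash:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: return a pointer to the
 * first unescaped '\n' in buf[0..len), or NULL if there is none.
 * Assumption: a backslash escapes the single byte that follows it.
 */
static const char *
find_line_end(const char *buf, size_t len)
{
    const char *p = buf;
    const char *end = buf + len;

    while (p < end)
    {
        /* Let libc locate the next candidate newline. */
        const char *nl = memchr(p, '\n', (size_t) (end - p));
        /* Only search for a backslash in the bytes before that newline. */
        const char *limit = nl ? nl : end;
        const char *bs = memchr(p, '\\', (size_t) (limit - p));

        if (bs == NULL)
            return nl;          /* no escape in the way; NULL if no newline */

        if (end - bs < 2)
            return NULL;        /* buffer ends right after the backslash */

        /* Skip the backslash and the byte it escapes, then rescan. */
        p = bs + 2;
    }
    return NULL;
}
```

On input with no backslashes this makes two memchr calls per line; each backslash costs roughly two more, which is why escape-heavy input is the case to worry about.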
I still need to do more benchmarking. I mentioned a ~5% speedup on the
test I ran earlier, which was a load of the lineitem table from TPC-H.
It looks like with cheaper data types the gain can be much bigger;
here's an oprofile from loading the TPC-H partsupp table:
Before:
samples  %        image name   symbol name
5146     25.7635  postgres     CopyReadLine
4089     20.4716  postgres     DoCopy
1449      7.2544  reiserfs     (no symbols)
1369      6.8539  postgres     pg_verify_mbstr_len
1013      5.0716  libc-2.7.so  memcpy
749       3.7499  libc-2.7.so  ____strtod_l_internal
598       2.9939  postgres     heap_formtuple
548       2.7436  libc-2.7.so  ____strtol_l_internal
403       2.0176  libc-2.7.so  memset
309       1.5470  libc-2.7.so  strlen
208       1.0414  postgres     AllocSetAlloc
...
After:
samples  %        image name   symbol name
4165     25.7879  postgres     DoCopy
1574      9.7455  postgres     pg_verify_mbstr_len
1520      9.4112  reiserfs     (no symbols)
1005      6.2225  libc-2.7.so  memchr
986       6.1049  libc-2.7.so  memcpy
632       3.9131  libc-2.7.so  ____strtod_l_internal
589       3.6468  postgres     heap_formtuple
546       3.3806  libc-2.7.so  ____strtol_l_internal
386       2.3899  libc-2.7.so  memset
366       2.2661  postgres     CopyReadLine
287       1.7770  libc-2.7.so  strlen
215       1.3312  postgres     LWLockAcquire
208       1.2878  postgres     hash_any
176       1.0897  postgres     LWLockRelease
161       0.9968  postgres     InputFunctionCall
157       0.9721  postgres     AllocSetAlloc
...
The profile shows that with the patch, ~8.5% of the CPU time is spent in
CopyReadLine+memchr (2.3% + 6.2%), vs. ~25.8% in CopyReadLine before.
That's quite a significant speedup.
I still need to test the worst-case performance, with input that has a
lot of escapes. It would be interesting to hear reports with this patch
from people on different platforms. These results are from my laptop
with a 32-bit Intel CPU, running Linux. There could be big differences in
the memchr implementations.
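For anyone who wants a quick feel for their platform's memchr before running a full COPY test, something along these lines might do. It is only a rough sketch under my own assumptions, not part of the patch: all the names (count_escapes_loop, count_escapes_memchr, fill_escapes, time_scan) are made up for illustration, and counting backslashes in a buffer is a stand-in for the real scan, not a replacement for benchmarking COPY itself.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical scanners for comparison; neither is code from the patch. */
typedef size_t (*scan_fn) (const char *buf, size_t len);

/* Count backslashes by inspecting every byte. */
static size_t
count_escapes_loop(const char *buf, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\\')
            n++;
    return n;
}

/* Count backslashes with one memchr call per hit, plus a final miss. */
static size_t
count_escapes_memchr(const char *buf, size_t len)
{
    size_t n = 0;
    const char *p = buf;
    const char *end = buf + len;

    while (p < end && (p = memchr(p, '\\', (size_t) (end - p))) != NULL)
    {
        n++;
        p++;
    }
    return n;
}

/* Fill buf with 'x', placing a backslash every stride bytes (stride > 0). */
static void
fill_escapes(char *buf, size_t len, size_t stride)
{
    memset(buf, 'x', len);
    for (size_t i = 0; i < len; i += stride)
        buf[i] = '\\';
}

/* Run fn over buf reps times; return CPU seconds, total count in *result. */
static double
time_scan(scan_fn fn, const char *buf, size_t len, int reps, size_t *result)
{
    clock_t start = clock();
    size_t total = 0;

    for (int i = 0; i < reps; i++)
        total += fn(buf, len);
    *result = total;
    return (double) (clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_scan with both scanners on a large buffer, once with a small stride (many escapes) and once with a large one (few escapes), should give a first impression of where memchr stops paying off on a given libc.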
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
copy-readline-memchr-3.patch | text/x-diff | 7.5 KB |