From: | "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> |
---|---|
To: | <pgsql-patches(at)postgresql(dot)org> |
Subject: | CopyReadLineText optimization |
Date: | 2008-02-24 01:29:47 |
Message-ID: | 47C0C88B.8090904@enterprisedb.com |
Lists: | pgsql-hackers pgsql-patches |
The purpose of CopyReadLineText is to scan the input buffer, and find
the next newline, taking into account any escape characters. It
currently operates in a loop, one byte at a time, searching for LF, CR,
or a backslash. That's a bit slow: I've been running oprofile on COPY,
and I've seen CopyReadLine take around 10% of the CPU time, and
Joshua Drake just posted a very similar profile to hackers.
Attached is a patch that modifies CopyReadLineText so that it uses
memchr to speed up the scan. The nice thing about memchr is that we can
take advantage of any clever optimizations that might be in libc or the
compiler.
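To illustrate the idea, here is a minimal sketch of a memchr-based scan. It is not the attached patch: the function name find_line_end is hypothetical, and the rules are simplified (only LF ends a line, a backslash escapes the single byte after it, CR and CSV mode are ignored).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the code from the patch.
 *
 * Return the offset of the first unescaped '\n' in buf[0..len),
 * or -1 if the buffer contains no complete line.
 */
static long
find_line_end(const char *buf, size_t len)
{
    size_t      pos = 0;

    while (pos < len)
    {
        /* Let libc find the next newline candidate in one call. */
        const char *nl = memchr(buf + pos, '\n', len - pos);

        /* Only the bytes before that candidate can hide a backslash. */
        size_t      span = nl ? (size_t) (nl - (buf + pos)) : len - pos;
        const char *bs = memchr(buf + pos, '\\', span);

        if (bs == NULL)
            return nl ? (long) (nl - buf) : -1;

        /* Skip the backslash and the byte it escapes, then rescan. */
        pos = (size_t) (bs - buf) + 2;
    }

    return -1;
}
```

When a line has no backslashes, the loop body runs once and the whole line is covered by two memchr calls. Each backslash costs one extra iteration, which is why input with many escapes is the likely worst case.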
In the tests I've been running, it roughly halves the time spent in
CopyReadLine (including the new memchr calls), thus reducing the total
CPU overhead by ~5%. I'm planning to run more tests with data that has
backslashes and with tables of different widths to see what the
worst-case and best-case performance looks like. Also, the patch doesn't
work for CSV format at the moment; that needs to be fixed.
5% isn't exactly breathtaking, but it's a start. I tried the same trick
in CopyReadAttributesText, but unfortunately it doesn't seem to help
there because you need to "stop" the efficient word-at-a-time scan that
memchr does (at least with glibc, YMMV) whenever there's a column
separator, while in CopyReadLineText you get to process the whole line
in one call, assuming there are no backslashes.
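For comparison, a rough sketch of what a per-column scan looks like. Again, this is not the code I tried in CopyReadAttributesText: count_fields is a hypothetical helper that ignores backslash escapes and only counts delimiter-separated fields.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only.
 *
 * Count the delimiter-separated fields in line[0..len). Every field
 * costs one memchr call, and each call returns as soon as it reaches
 * the next delimiter.
 */
static size_t
count_fields(const char *line, size_t len, char delim)
{
    size_t      nfields = 1;
    const char *p = line;
    const char *end = line + len;
    const char *hit;

    while ((hit = memchr(p, delim, (size_t) (end - p))) != NULL)
    {
        nfields++;
        p = hit + 1;            /* restart the scan just past the delimiter */
    }

    return nfields;
}
```

With narrow columns each memchr call covers only a few bytes before it has to return, so the call overhead is paid once per column instead of once per line.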
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
copy-readline-memchr-2.patch | text/x-diff | 4.2 KB |