From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Undocumented feature costs a lot of performance in COPY IN |
Date: | 2001-12-04 19:49:05 |
Message-ID: | 2841.1007495345@sss.pgh.pa.us |
Lists: | pgsql-hackers pgsql-patches |

I have been fooling around profiling various ways of inserting wide
(8000-byte, not all that wide) bytea fields, per Brent Verner's note
of a few days ago. COPY IN should be, and is, the fastest way to
do it. But I was rather startled to discover that 25% of the runtime
of COPY IN went to an inefficient way of fetching single bytes from
pqcomm.c (pq_getbytes(&ch, 1) instead of ch = pq_getbyte()), and
20% of what's left after fixing that is going into the strchr() call
in CopyReadAttribute.
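
To illustrate why fetching one byte through a general-purpose multi-byte routine costs more than a dedicated single-byte call, here is a minimal sketch in C. The names RecvBuf, buf_getbytes and buf_getbyte are hypothetical and invented for this example; this is not the pqcomm.c code, and a real implementation would also have to refill the buffer from the connection.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory receive buffer; NOT the actual pqcomm.c structures. */
typedef struct
{
    const unsigned char *data;  /* buffered bytes */
    size_t               len;   /* number of valid bytes in data */
    size_t               pos;   /* next unread position */
} RecvBuf;

/*
 * General-purpose read: copy exactly n bytes into dst.
 * Returns 0 on success, -1 if the buffer runs out first.
 * For n == 1 the caller still pays for the loop setup and a memcpy call.
 */
static int
buf_getbytes(RecvBuf *b, void *dst, size_t n)
{
    unsigned char *out = dst;

    while (n > 0)
    {
        size_t avail = b->len - b->pos;
        size_t chunk;

        if (avail == 0)
            return -1;
        chunk = (avail < n) ? avail : n;
        memcpy(out, b->data + b->pos, chunk);
        b->pos += chunk;
        out += chunk;
        n -= chunk;
    }
    return 0;
}

/*
 * Single-byte fast path: one bounds check and one load.
 * Returns the byte as a non-negative int, or -1 if the buffer is empty.
 */
static int
buf_getbyte(RecvBuf *b)
{
    if (b->pos >= b->len)
        return -1;
    return b->data[b->pos++];
}
```

In this sketch, buf_getbytes(&buf, &ch, 1) and ch = buf_getbyte(&buf) deliver the same byte; the second form simply does less work per character, which matters when it runs once per input byte.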

Now the point of that strchr() call is to detect whether the current
character is the column delimiter. The COPY reference page clearly
says:

    By default, a text copy uses a tab ("\t") character as a
    delimiter between fields. The field delimiter may be changed to
    any other single character with the keyword phrase USING
    DELIMITERS. Characters in data fields which happen to match the
    delimiter character will be backslash quoted. Note that the
    delimiter is always a single character. If multiple characters
    are specified in the delimiter string, only the first character
    is used.

and indeed, only the first character is used by COPY OUT. But COPY IN
is presently coded so that if multiple characters are mentioned in
USING DELIMITERS, any one of them will be taken as a field delimiter.
I would like to change the code to just "if (c == delim[0])",
which should buy back most of that 20% and make the behavior match the
documentation. Question for the list: is this a bad change? Is anyone
out there actually using this undocumented behavior?
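
The difference between the two delimiter checks can be sketched in C as follows. The function names is_delim_any and is_delim_first are hypothetical, chosen only for this illustration; the actual code in CopyReadAttribute may be structured differently.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Behavior described for COPY IN today: a character counts as a field
 * delimiter if it appears anywhere in the delimiter string.
 * strchr() would also match the string's terminating NUL, so c == 0 is
 * excluded explicitly in this sketch.
 */
static bool
is_delim_any(int c, const char *delim)
{
    return c != '\0' && strchr(delim, c) != NULL;
}

/*
 * Proposed behavior, matching the documentation and COPY OUT: only the
 * first character of the delimiter string is significant.
 */
static bool
is_delim_first(int c, const char *delim)
{
    return c == delim[0];
}
```

With a delimiter string of ",;", is_delim_any accepts both ',' and ';', while is_delim_first accepts only ','. The second form replaces a library call per input character with a single comparison.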

regards, tom lane