From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Sutou Kouhei <kou(at)clear-code(dot)com> |
Cc: | andres(at)anarazel(dot)de, sawada(dot)mshk(at)gmail(dot)com, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date: | 2024-02-22 06:44:16 |
Message-ID: | ZdbtQJ-p5H1_EDwE@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Feb 15, 2024 at 03:34:21PM +0900, Sutou Kouhei wrote:
> It seems that it improves performance a bit but my
> environment isn't suitable for benchmark. So they may not
> be valid numbers.
I was comparing what you have here, and what's been attached by Andres
at [1] and the top of the changes on my development branch at [2]
(v3-0008, mostly). And, it strikes me that there is no need to do any
major changes in any of the callbacks proposed up to v13 and v14 in
this thread, as all the changes proposed want to plug in more data
into each StateData for COPY FROM and COPY TO, the best part being
that v3-0008 can just reuse the proposed callbacks as-is. v1-0001
from Sutou-san would need one slight tweak in the per-row callback,
still that's minor.
I have been spending more time on the patch to introduce the COPY
APIs, leading me to the v15 attached, where I have replaced the
previous attribute callbacks for the output representation and the
reads with hardcoded routines that should be optimized by compilers,
and I have done more profiling with -O2. I'm aware of the disparities
in the per-row and start callbacks for the text/csv cases as well as
the default expressions, but these are really format-dependent with
their own assumptions so splitting them is something that makes
limited sense to me. I've also looks at externalizing some of the
error handling, though the result was not that beautiful, so what I
got here is what makes the callbacks leaner and easier to work with.
First, some results for COPY FROM using the previous tests (30 int
attributes, running on scissors, data sent to blackhole_am, etc.) in
NextCopyFrom() which becomes the hot-spot:
* Using v15:
Children Self Command Shared Object Symbol
- 66.42% 0.71% postgres postgres [.] NextCopyFrom
- 65.70% NextCopyFrom
- 65.49% CopyFromTextLikeOneRow
+ 19.29% InputFunctionCallSafe
+ 15.81% CopyReadLine
13.89% CopyReadAttributesText
+ 0.71% _start
* Using HEAD (today's 011d60c4352c):
Children Self Command Shared Object Symbol
- 67.09% 16.64% postgres postgres [.] NextCopyFrom
- 50.45% NextCopyFrom
- 30.89% NextCopyFromRawFields
+ 16.26% CopyReadLine
13.59% CopyReadAttributesText
+ 19.24% InputFunctionCallSafe
+ 16.64% _start
In this case, I have been able to limit the effects of the per-row
callback by making NextCopyFromRawFields() local to copyfromparse.c
while applying some inlining to it. This brings me to a different
point, why don't we do this change independently on HEAD? It's not
really complicated to make NextCopyFromRawFields show high in the
profiles. I was looking at external projects, and noticed that
there's nothing calling NextCopyFromRawFields() directly.
Second, some profiles with COPY TO (30 int integers, running on
scissors) where data is sent /dev/null:
* Using v15:
Children Self Command Shared Object Symbol
- 85.61% 0.34% postgres postgres [.] CopyOneRowTo
- 85.26% CopyOneRowTo
- 75.86% CopyToTextOneRow
+ 36.49% OutputFunctionCall
+ 10.53% appendBinaryStringInfo
9.66% CopyAttributeOutText
1.34% int4out
0.92% 0xffffa9803be8
0.79% enlargeStringInfo
0.77% memcpy(at)plt
0.69% 0xffffa9803be4
+ 3.12% CopySendEndOfRow
2.81% CopySendChar
0.95% pgstat_progress_update_param
0.95% appendBinaryStringInfo
0.55% MemoryContextReset
* Using HEAD (today's 011d60c4352c):
Children Self Command Shared Object Symbol
- 80.35% 14.23% postgres postgres [.] CopyOneRowTo
- 66.12% CopyOneRowTo
+ 35.40% OutputFunctionCall
+ 11.00% appendBinaryStringInfo
8.38% CopyAttributeOutText
+ 2.98% CopySendEndOfRow
1.52% int4out
0.88% pgstat_progress_update_param
0.87% 0xffff8ab32be8
0.74% memcpy(at)plt
0.68% enlargeStringInfo
0.61% 0xffff8ab32be4
0.51% MemoryContextReset
+ 14.23% _start
The increase in CopyOneRowTo from 80% to 85% worries me but I am not
quite sure how to optimize that with the current structure of the
code, so the dispatch caused by per-row callback is noticeable in
what's my worst test case. I am not quite sure how to avoid that,
TBH. A result that has been puzzling me is that I am getting faster
runtimes with v15 (6232ms in average) vs HEAD (6550ms) at 5M rows with
COPY TO for what led to these profiles (for tests without perf
attached to the backends).
Any thoughts overall?
[1]: https://www.postgresql.org/message-id/20240218015955.rmw5mcmobt5hbene%40awork3.anarazel.de
[2]: https://www.postgresql.org/message-id/ZcWoTr1N0GELFA9E@paquier.xyz
--
Michael
Attachment | Content-Type | Size |
---|---|---|
v15-0001-Extract-COPY-FROM-TO-format-implementations.patch | text/x-diff | 39.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | shveta malik | 2024-02-22 06:46:34 | Re: Synchronizing slots from primary to standby |
Previous Message | Sutou Kouhei | 2024-02-22 06:40:55 | Re: Why is pq_begintypsend so slow? |