From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | smithpb2250(at)gmail(dot)com |
Cc: | vignesh21(at)gmail(dot)com, dgrowleyml(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Improvements in Copy From |
Date: | 2020-09-11 09:04:01 |
Message-ID: | 20200911.180401.1250008268606505036.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Fri, 11 Sep 2020 18:44:13 +1000, Peter Smith <smithpb2250(at)gmail(dot)com> wrote in
> On Thu, Sep 10, 2020 at 9:21 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > Whether such a micro-optimisation is worth doing is another question.
> > Yes, what you suggested can also be done, but I have the same
> > question as you. Because we would save just one function call, and the
> > eof check is performed immediately in the function, should we include
> > this or not?
>
> I expect the difference from my suggestion is too small to be measured.
>
> Probably it is not worth changing the already complicated code unless
> those changes can achieve something observable.
>
> ~~
>
> FYI, I ran a few performance tests BEFORE/AFTER applying your patch.
>
> Perf results for \COPY 5GB CSV file to UNLOGGED table.
>
> perf -a -g <pid>
> psql -d test -c "\copy tbl from '/my/path/data_5GB.csv' with (format csv);"
> perf report -g
>
> BEFORE
> #1 CopyReadLineText = 12.70%, CopyLoadRawBuf = 0.81%
> #2 CopyReadLineText = 12.54%, CopyLoadRawBuf = 0.81%
> #3 CopyReadLineText = 12.52%, CopyLoadRawBuf = 0.81%
>
> AFTER
> #1 CopyReadLineText = 12.55%, CopyLoadRawBuf = 1.20%
> #2 CopyReadLineText = 12.15%, CopyLoadRawBuf = 1.10%
> #3 CopyReadLineText = 13.11%, CopyLoadRawBuf = 1.24%
> #4 CopyReadLineText = 12.86%, CopyLoadRawBuf = 1.18%
>
> I didn't quite know how to interpret those results. It was the opposite
> of what I expected. Perhaps the original excessive CopyLoadRawBuf calls
> were so brief they could often avoid being sampled? Anyway, I hope you
> have a better understanding of perf than I do and can explain it.
>
> I then repeated/timed the same tests, but without perf.
>
> BEFORE
> #1 4min.36s
> #2 4min.45s
> #3 4min.43s
> #4 4min.34s
>
> AFTER
> #1 4min.41s
> #2 4min.37s
> #3 4min.34s
>
> As you can see, unfortunately, the patch gave no observable benefit
> for my test case.
That observation agrees with my assumption.
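To put numbers on "no observable benefit", here is a minimal sketch that only redoes the arithmetic on the wall-clock timings quoted above. The variable names are illustrative and nothing here is part of the patch or of any benchmark tooling. With only three or four runs per side this is a rough sanity check, not a statistical test.

```python
from statistics import mean

def to_seconds(minutes, seconds):
    """Convert a 'Xmin.Ys' reading into seconds."""
    return minutes * 60 + seconds

# Wall-clock timings copied from the quoted message (runs without perf).
before = [to_seconds(4, 36), to_seconds(4, 45), to_seconds(4, 43), to_seconds(4, 34)]
after = [to_seconds(4, 41), to_seconds(4, 37), to_seconds(4, 34)]

before_mean = mean(before)
after_mean = mean(after)
mean_gain = before_mean - after_mean

# Run-to-run spread (max minus min) within each group.
before_spread = max(before) - min(before)
after_spread = max(after) - min(after)

print(f"BEFORE mean {before_mean:.1f}s, spread {before_spread}s")
print(f"AFTER  mean {after_mean:.1f}s, spread {after_spread}s")
print(f"mean gain {mean_gain:.1f}s")
```

The means come out at roughly 279.5 s and 277.3 s, a gap of about 2 s, while the runs within each group already differ by 11 s and 7 s. The gap is smaller than the run-to-run noise.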
At Fri, 11 Sep 2020 15:58:04 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
me> we should do that. On the contrary, if incoming data were
me> intermittently delayed for some reasons (heavy load of client or
me> in-between network), this patch would make things worse by waiting for
me> delayed bits before processing already received bits.
It seems that a slow network alone is enough to cause that behavior, even
without any other trouble.
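The concern can be illustrated with a toy model. This is only a sketch under loud assumptions: the functions `eager_finish` and `fill_first_finish`, the arrival times, the per-byte cost, and the buffer size are all hypothetical and do not come from the patch or from the COPY code. It compares processing each piece of input as soon as it arrives with waiting until a buffer is full before processing anything.

```python
def eager_finish(arrivals, cost_per_byte):
    """Finish time when every chunk is processed as soon as it is available."""
    clock = 0.0
    for arrive_at, nbytes in arrivals:
        # Start when the chunk has arrived and earlier work is done.
        clock = max(clock, arrive_at) + nbytes * cost_per_byte
    return clock

def fill_first_finish(arrivals, cost_per_byte, buf_size):
    """Finish time when processing waits until buf_size bytes are buffered
    (or the input ends)."""
    clock = 0.0
    pending = 0
    for i, (arrive_at, nbytes) in enumerate(arrivals):
        pending += nbytes
        is_last = i == len(arrivals) - 1
        if pending >= buf_size or is_last:
            # Already received bytes sat idle until this chunk arrived.
            clock = max(clock, arrive_at) + pending * cost_per_byte
            pending = 0
    return clock

# Hypothetical input: (arrival time in seconds, bytes) per chunk.
slow_network = [(0.0, 4), (1.0, 4), (2.0, 4), (3.0, 4)]
fast_network = [(0.0, 4), (0.0, 4), (0.0, 4), (0.0, 4)]

for name, arrivals in (("slow", slow_network), ("fast", fast_network)):
    eager = eager_finish(arrivals, cost_per_byte=0.2)
    filled = fill_first_finish(arrivals, cost_per_byte=0.2, buf_size=8)
    print(f"{name}: eager {eager:.1f}s, fill-first {filled:.1f}s")
```

In this model the two strategies finish at the same time when all data is immediately available, while with steadily trickling input the fill-first strategy finishes later, because bytes that have already been received wait for the rest of the buffer.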
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center