From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Adam Lee <ali(at)pivotal(dot)io> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: Add LogicalTapeSetExtend() to logtape.c |
Date: | 2020-03-04 20:06:07 |
Message-ID: | 91f9ae9935c715d5638954f52e0e33c4d1dc1388.camel@j-davis.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, 2020-03-04 at 11:57 +0800, Adam Lee wrote:
> Master(e537aed61d): 13342.844 ms 13195.982 ms 13271.023 ms
> With my patch(a pointer): 13020.029 ms 13008.158 ms 13063.658 ms
> With your patch(flexible array): 12870.117 ms 12814.725 ms 13119.255
> ms
I tracked the problem down.
When we change from a flexible array to a pointer and a separately-
allocated chunk, then it causes some unrelated code in
LogicalTapeWrite() to be optimized differently in my environment.
Specifically, the memcpy() in LogicalTapeWrite() gets turned into an
inline implementation using the "rep movsq" instruction. For my
particular case, actually calling memcpy (rather than using the inlined
version) is a lot faster.
In my test case, LogicalTapeWrite() was being called with size of 4 or
10, so perhaps those small values are just handled much more
efficiently in the real memcpy.
To get it to use the real memcpy(), I had to '#undef _FORTIFY_SOURCE'
at the top of the file, and pass -fno-builtin-memcpy. Then the
regression went away.
I don't care for the version I posted where it repalloc()s the entire
struct. That seems needlessly odd and could cause confusion; and it
also changes the API so that the caller needs to update its pointers.
I'm inclined to use a version similar to Adam's, where it has a pointer
and a separate palloc() chunk (attached). That causes a regression in
my setup, but it's not a terrible regression, and it's not really the
"fault" of the change. Trying to twist code around to satisfy a
particular compiler/libc seems like a bad idea. It also might depend on
the exact query, and may even be faster for some.
Any objections?
Regards,
Jeff Davis
Attachment | Content-Type | Size |
---|---|---|
logtape-20200303.patch | text/x-patch | 4.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Nikita Glukhov | 2020-03-04 20:18:52 | Re: jsonpath syntax extensions |
Previous Message | Nikolay Shaplov | 2020-03-04 19:58:31 | Re: [PATCH] Finally split StdRdOptions into HeapOptions and ToastOptions |