From: | Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Auto-vectorization speeds up multiplication of large-precision numerics |
Date: | 2020-07-21 09:16:18 |
Message-ID: | CAJ3gD9cQiGvyPPqhj_fLaYPrDz+KniTtpmT9E3RYBWF_4ePR6A@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, 13 Jul 2020 at 14:27, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> I tried this in utils/adt/Makefile :
> +
> +numeric.o: CFLAGS += ${CFLAGS_VECTOR}
> +
> and it works.
>
> CFLAGS_VECTOR also includes the -funroll-loops option, which, I
> believe, had shown improvements in the checksum.c runs ( [1] ). This
> option makes the object file a bit bigger. For numeric.o, its size
> increased by 15K, from 116672 to 131360 bytes. I ran the
> multiplication test, and didn't see any additional speed-up with this
> option. Also, it does not seem to be related to vectorization. So I
> was thinking of splitting the CFLAGS_VECTOR into CFLAGS_VECTOR and
> CFLAGS_UNROLL_LOOPS. checksum.c can use both of these flags, and
> numeric.c can use only CFLAGS_VECTOR.
I did as above. Attached is the v2 patch.
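As a rough illustration of the split (a sketch only, not the text of the attached patch), the per-object rules could look like the fragment below. The flag values are assumed GCC stand-ins for what configure would normally supply, and putting both rules in one fragment is just for illustration; in the tree they would live in the Makefiles of the respective directories.

```make
# Assumed stand-in values; in a real build configure would supply these.
CFLAGS_VECTOR = -ftree-vectorize
CFLAGS_UNROLL_LOOPS = -funroll-loops

# checksum.o keeps both vectorization and loop unrolling.
checksum.o: CFLAGS += ${CFLAGS_VECTOR} ${CFLAGS_UNROLL_LOOPS}

# numeric.o gets vectorization only; unrolling grew the object file
# without any measured speed-up in the multiplication test.
numeric.o: CFLAGS += ${CFLAGS_VECTOR}
```

The target-specific `+=` appends to whatever CFLAGS the build already uses, so only these two object files see the extra options.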
With the existing CFLAGS_VECTOR, an environment variable of that name
can also be set when running configure. I did the same for
CFLAGS_UNROLL_LOOPS.
Now, developers who already set CFLAGS_VECTOR in the environment while
running configure are presumably doing so because their compilers don't
have these options, so they must be supplying some equivalent
options. numeric.c will now be compiled with CFLAGS_VECTOR,
so for them it will be compiled with their equivalents of the
vectorize and unroll-loops options, which is OK, I think. The only
effect is that numeric.o will be larger.
--
Thanks,
-Amit Khandekar
Huawei Technologies
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Auto-vectorize-loop-to-speedup-large-precision-numer.patch | text/x-patch | 8.2 KB |