From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | John Naylor <johncnaylorls(at)gmail(dot)com> |
Cc: | Ants Aasma <ants(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: add AVX2 support to simd.h |
Date: | 2024-03-19 03:16:01 |
Message-ID: | 20240319031601.GA828141@nathanxps13 |
Lists: | pgsql-hackers |
On Tue, Mar 19, 2024 at 10:03:36AM +0700, John Naylor wrote:
> I took a brief look, and 0001 isn't quite what I had in mind. I can't
> quite tell what it's doing with the additional branches and "goto
> retry", but I meant something pretty simple:
Do you mean 0002? 0001 just adds a 2-register loop for remaining elements
once we've exhausted what can be processed with the 4-register loop.
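
For illustration, here is a minimal sketch of that tiered structure in portable C. It is not the patch itself: LANES, chunk_has() and find32_tiered() are hypothetical names, and the scalar helper merely stands in for the vector compare that simd.h would provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of 32-bit elements held by one vector register. */
#define LANES 4

/* Scalar stand-in for a vector compare over n elements (assumption). */
static bool
chunk_has(const uint32_t *p, size_t n, uint32_t key)
{
    bool        found = false;

    for (size_t i = 0; i < n; i++)
        found |= (p[i] == key);
    return found;
}

static bool
find32_tiered(uint32_t key, const uint32_t *base, size_t nelem)
{
    size_t      i = 0;

    assert(base != NULL || nelem == 0);

    /* Main loop: four registers' worth of elements per iteration. */
    for (; i + 4 * LANES <= nelem; i += 4 * LANES)
        if (chunk_has(&base[i], 4 * LANES, key))
            return true;

    /* Secondary loop: two registers' worth, for what the main loop left. */
    for (; i + 2 * LANES <= nelem; i += 2 * LANES)
        if (chunk_has(&base[i], 2 * LANES, key))
            return true;

    /* Anything still remaining is handled one element at a time. */
    for (; i < nelem; i++)
        if (base[i] == key)
            return true;

    return false;
}
```

With LANES at 4, the secondary loop runs at most once, since fewer than 16 elements remain after the main loop.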
> - if short, do one element at a time and return
0002 does this.
> - if long, do one block unconditionally, then round the start pointer
> up so that "end - start" is an exact multiple of blocks, and loop over
> them
0002 does the opposite of this. That is, after we've completed as many
blocks as possible, we move the iterator variable back to "end -
block_size" and do one final iteration to cover all the remaining elements.
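
A rough sketch of that control flow, again in portable C rather than the actual patch: BLOCK_ELEMS, block_has() and find32_overlap() are made-up names, and the scalar helper stands in for the real multi-register vector comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block size: elements consumed by one vectorized iteration. */
#define BLOCK_ELEMS 16

/* Scalar stand-in for comparing one whole block against key (assumption). */
static bool
block_has(const uint32_t *p, uint32_t key)
{
    bool        found = false;

    for (size_t i = 0; i < BLOCK_ELEMS; i++)
        found |= (p[i] == key);
    return found;
}

static bool
find32_overlap(uint32_t key, const uint32_t *base, size_t nelem)
{
    size_t      i = 0;

    assert(base != NULL || nelem == 0);

    /* Short input: do one element at a time and return. */
    if (nelem < BLOCK_ELEMS)
    {
        for (; i < nelem; i++)
            if (base[i] == key)
                return true;
        return false;
    }

    /* Complete as many whole blocks as possible. */
    for (; i + BLOCK_ELEMS <= nelem; i += BLOCK_ELEMS)
        if (block_has(&base[i], key))
            return true;

    /*
     * If elements remain, move back to "end - block_size" and do one final
     * iteration.  It overlaps elements that were already checked, which is
     * harmless for a pure search.
     */
    if (i < nelem)
        return block_has(&base[nelem - BLOCK_ELEMS], key);

    return false;
}
```

The final block re-reads up to BLOCK_ELEMS - 1 elements that were already examined, but the tail never needs a per-element loop once the input is at least one block long.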
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com