From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Ants Aasma <ants(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: add AVX2 support to simd.h |
Date: | 2024-03-19 09:53:04 |
Message-ID: | CANWCAZafKPUBYdNdtqZLVxVJhSn-ONeo_tp1FsODcn7udjKwRQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Mar 19, 2024 at 10:16 AM Nathan Bossart
<nathandbossart(at)gmail(dot)com> wrote:
>
> On Tue, Mar 19, 2024 at 10:03:36AM +0700, John Naylor wrote:
> > I took a brief look, and 0001 isn't quite what I had in mind. I can't
> > quite tell what it's doing with the additional branches and "goto
> > retry", but I meant something pretty simple:
>
> Do you mean 0002? 0001 just adds a 2-register loop for remaining elements
> once we've exhausted what can be processed with the 4-register loop.
Sorry, I was looking at v2 at the time.
> > - if short, do one element at a time and return
>
> 0002 does this.
That part looks fine.
> > - if long, do one block unconditionally, then round the start pointer
> > up so that "end - start" is an exact multiple of blocks, and loop over
> > them
>
> 0002 does the opposite of this. That is, after we've completed as many
> blocks as possible, we move the iterator variable back to "end -
> block_size" and do one final iteration to cover all the remaining elements.
Sounds similar in principle, but it looks really complicated. I don't
think the additional loops and branches are a good way to go, either
for readability or for branch prediction. My sketch has one branch for
which loop to do, and then performs only one loop. Let's do the
simplest thing that could work. (I think we might need a helper
function to do the block, but the rest should be easy.)
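
For illustration, the sketch described above might look roughly like the following. This is not code from any of the patches in this thread: the names lfind32_sketch, block_contains, and BLOCK_ELEMS are hypothetical, and the helper uses a plain scalar loop as a stand-in for whatever vector comparison a real block routine would do. It only aims to show the control flow: one branch to pick short vs. long, one unconditional block, then a single loop over whole blocks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block size: number of elements handled per "block". */
#define BLOCK_ELEMS 16

/*
 * Hypothetical helper: returns true if key occurs anywhere in the
 * BLOCK_ELEMS elements starting at p.  A real implementation would use
 * vector loads and compares; a scalar loop stands in for that here.
 */
static inline bool
block_contains(uint32_t key, const uint32_t *p)
{
	bool		found = false;

	for (size_t i = 0; i < BLOCK_ELEMS; i++)
		found |= (p[i] == key);
	return found;
}

static bool
lfind32_sketch(uint32_t key, const uint32_t *base, size_t nelem)
{
	const uint32_t *start = base;
	const uint32_t *end = base + nelem;
	size_t		rem;

	/* Short input: one element at a time, then return. */
	if (nelem < BLOCK_ELEMS)
	{
		for (; start < end; start++)
		{
			if (*start == key)
				return true;
		}
		return false;
	}

	/* Long input: do the first block unconditionally. */
	if (block_contains(key, start))
		return true;

	/*
	 * Advance start so that "end - start" is an exact multiple of the
	 * block size.  When nelem is not a multiple, the next block overlaps
	 * the one just checked, which is harmless for a search.
	 */
	rem = nelem % BLOCK_ELEMS;
	start += (rem != 0) ? rem : BLOCK_ELEMS;

	/* Single loop over whole blocks; no tail handling needed. */
	for (; start < end; start += BLOCK_ELEMS)
	{
		if (block_contains(key, start))
			return true;
	}
	return false;
}
```

The overlap means a few elements may be compared twice, which is presumably cheaper than extra loops and branches for the tail. It does rely on the operation being idempotent, as a membership search is.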