On 18.07.22 at 00:37, Thomas Munro wrote:
> Seems OK for a worst case. It must still be a lot faster than doing
> it in SQL. Now I wonder what the exact requirements would be to
> dispatch to a faster version that would handle int4. I haven't
> studied this in detail but perhaps to dispatch to a fast shuffle for
> objects of size X, the requirement would be something like typlen == X
> && align_bytes <= typlen && typlen % align_bytes == 0, where
> align_bytes is typalign converted to ALIGNOF_{CHAR,SHORT,INT,DOUBLE}?
> Or in English, 'the data consists of densely packed objects of fixed
> size X, no padding'. Or perhaps you can work out the padded size and
> use that, to catch a few more types. Then you call
> array_shuffle_{2,4,8}() as appropriate, which should be as fast as
> your original int[] proposal, but work also for float, date, ...?
>
> About your experimental patch, I haven't reviewed it properly or tried
> it but I wonder if uint32 dat_offset, uint32 size (= half size
> elements) would be enough due to limitations on varlenas.
I made another experimental patch with fast paths for typlen 4 and
typlen 8. Alignment is not yet considered.
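
For illustration only, the dispatch condition quoted above and a
fixed-width shuffle could be sketched roughly as below. This is not the
patch itself: can_use_fast_shuffle() and its parameters are hypothetical
names, array_shuffle_4() only borrows the name suggested in the quote,
and rand() stands in for whatever random source the real code would use.
It assumes align_bytes has already been derived from typalign and is
positive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: true if the array data can be treated as densely
 * packed fixed-size objects of 'width' bytes with no padding, i.e.
 * typlen == X && align_bytes <= typlen && typlen % align_bytes == 0.
 * Varlena types (typlen < 0) never match a positive width.
 */
static bool
can_use_fast_shuffle(int typlen, int align_bytes, int width)
{
    return typlen == width &&
           align_bytes <= typlen &&
           typlen % align_bytes == 0;
}

/*
 * Fisher-Yates shuffle over n 4-byte elements, in place.
 * Assumption: rand() is only a placeholder; the modulo introduces a
 * small bias that a real implementation would want to avoid.
 */
static void
array_shuffle_4(uint32_t *data, size_t n)
{
    assert(n == 0 || data != NULL);

    for (size_t i = n; i > 1; i--)
    {
        size_t   j = (size_t) rand() % i;   /* pick j in [0, i) */
        uint32_t tmp = data[i - 1];

        data[i - 1] = data[j];
        data[j] = tmp;
    }
}
```

A caller would check can_use_fast_shuffle(typlen, align_bytes, 4) first
and fall back to the generic path otherwise; an 8-byte variant would
look the same with uint64_t.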