Re: [PATCH] SVE popcount support

From: "Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com" <Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: "Malladi, Rama" <ramamalladi(at)hotmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Ragesh(dot)Hajela(at)fujitsu(dot)com" <Ragesh(dot)Hajela(at)fujitsu(dot)com>, Salvatore Dipietro <dipiets(at)amazon(dot)com>, "Devanga(dot)Susmitha(at)fujitsu(dot)com" <Devanga(dot)Susmitha(at)fujitsu(dot)com>
Subject: Re: [PATCH] SVE popcount support
Date: 2025-03-19 11:08:52
Message-ID: TY2PR01MB26675E7AE3638D02EFCE63F297D92@TY2PR01MB2667.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Wed, Mar 13, 2025 at 12:02:07AM +0000, nathandbossart(at)gmail(dot)com wrote:
> Those are nice results. I'm a little worried about the Neon implementation
> for smaller inputs since it uses a per-byte loop for the remaining bytes,
> though. If we can ensure there's no regression there, I think this patch
> will be in decent shape.

True, the Neon implementation in patch v6 did perform worse for smaller inputs.
This is fixed in v7: we now use pg_popcount64 to speed up processing of smaller
inputs and the remaining bytes. Also, as with SVE, the neon-2reg version
performed better than neon-1reg, but neon-4reg showed no further improvement.
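
To illustrate the idea behind the tail handling, here is a minimal scalar
sketch. It is not the code from the v7 patch: sketch_popcount64 and
sketch_popcount_tail are hypothetical names, the GCC/Clang builtin stands in
for pg_popcount64, and the vector loop that would run before this is omitted.
The point is only that leftover bytes are consumed one 8-byte word at a time,
with the per-byte loop reserved for the final 0-7 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for pg_popcount64, using the GCC/Clang builtin. */
static inline int
sketch_popcount64(uint64_t word)
{
    return __builtin_popcountll(word);
}

/*
 * Count the set bits in the bytes left over after a vector loop
 * (hypothetical helper; the vector loop itself is not shown).
 */
static uint64_t
sketch_popcount_tail(const unsigned char *buf, size_t len)
{
    uint64_t    popcnt = 0;

    assert(buf != NULL || len == 0);

    /* Consume whole 8-byte words first. */
    while (len >= sizeof(uint64_t))
    {
        uint64_t    word;

        /* memcpy avoids undefined behavior on unaligned input */
        memcpy(&word, buf, sizeof(word));
        popcnt += sketch_popcount64(word);
        buf += sizeof(word);
        len -= sizeof(word);
    }

    /* Fewer than 8 bytes remain, so the per-byte loop is short. */
    while (len-- > 0)
        popcnt += sketch_popcount64(*buf++);

    return popcnt;
}
```

With a per-byte loop alone, an input of 3 words (24 bytes) that is too short
for the vector loop costs 24 iterations; with the word-wise step it costs 3.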

The table below compares patches v6 and v7 on an m7g.4xlarge instance.
Query: SELECT drive_popcount(1000000, 8-byte words);
 8-byte words | master | v6-neon-2reg | v7-neon-2reg | v7-sve
--------------+--------+--------------+--------------+--------
            1 |  4.051 |        6.239 |        3.431 |  3.343
            2 |  4.429 |       10.773 |        3.899 |  3.335
            3 |  4.844 |       14.066 |        4.398 |  3.348
            4 |  5.324 |        3.342 |        3.663 |  3.365
            5 |  5.900 |        7.108 |        4.349 |  4.441
            6 |  6.478 |       11.720 |        4.851 |  4.441
            7 |  7.192 |       15.686 |        5.551 |  4.447
            8 |  8.016 |        4.288 |        4.367 |  4.013

We modified [0] to get the corresponding numbers for pg_popcount_masked:
 8-byte words | master  | v7-neon-2reg | v7-sve
--------------+---------+--------------+---------
            1 |   4.289 |        4.202 |   3.827
            2 |   4.993 |        4.662 |   3.823
            3 |   5.981 |        5.459 |   3.834
            4 |   6.438 |        4.230 |   3.846
            5 |   7.169 |        5.236 |   5.072
            6 |   7.949 |        5.922 |   5.106
            7 |   9.130 |        6.535 |   5.060
            8 |   9.796 |        5.328 |   4.718
          512 | 387.543 |      182.801 |  77.077
         1024 | 760.644 |      360.660 | 150.519

[0] https://postgr.es/m/CAFBsxsE7otwnfA36Ly44zZO+b7AEWHRFANxR1h1kxveEV=ghLQ@mail.gmail.com

-Chiranmoy

Attachment Content-Type Size
v7-0001-SVE-and-NEON-support-for-pg_popcount.patch application/octet-stream 19.6 KB
