
Optimizing COPY with SIMD

From:	Neil Conway <neil(dot)conway(at)gmail(dot)com>
To:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Optimizing COPY with SIMD
Date:	2024-06-02 19:17:21
Message-ID:	CAOW5sYb1HprQKrzjCsrCP1EauQzZy+njZ-AwBbOUMoGJHJS7Sw@mail.gmail.com
Lists:	pgsql-hackers

Inspired by David Rowley's work [1] on optimizing JSON escape processing
with SIMD, I noticed that the COPY code could potentially benefit from SIMD
instructions in a few places, e.g.:

(1) CopyAttributeOutCSV() has 2 byte-by-byte loops
(2) CopyAttributeOutText() has 1
(3) CopyReadLineText() has 1
(4) CopyReadAttributesCSV() has 1
(5) CopyReadAttributesText() has 1
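
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of what "a byte-by-byte loop" versus a chunked scan looks like. It is not taken from the PostgreSQL source or from the attached patch. The function names are hypothetical, and the chunked variant uses a portable SWAR trick (8 bytes per step in a uint64_t) as a stand-in for real SIMD instructions, so that it compiles anywhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar baseline: index of the first byte equal to a or b, or len if none. */
static size_t
find_special_scalar(const char *s, size_t len, char a, char b)
{
    size_t      i;

    for (i = 0; i < len; i++)
    {
        if (s[i] == a || s[i] == b)
            return i;
    }
    return len;
}

/* Nonzero if any of the 8 bytes in v is zero (classic SWAR test). */
static int
has_zero_byte(uint64_t v)
{
    return ((v - UINT64_C(0x0101010101010101)) & ~v &
            UINT64_C(0x8080808080808080)) != 0;
}

/*
 * Chunked variant: skip 8 bytes at a time while a chunk contains neither
 * a nor b, then let the scalar loop locate the exact position (or finish
 * the tail that is shorter than one chunk).
 */
static size_t
find_special_chunked(const char *s, size_t len, char a, char b)
{
    const uint64_t ones = UINT64_C(0x0101010101010101);
    uint64_t    pattern_a = ones * (unsigned char) a;   /* a in every byte */
    uint64_t    pattern_b = ones * (unsigned char) b;   /* b in every byte */
    size_t      i = 0;

    while (len - i >= sizeof(uint64_t))
    {
        uint64_t    chunk;

        memcpy(&chunk, s + i, sizeof(chunk));   /* safe unaligned load */
        /* XOR turns a matching byte into zero */
        if (has_zero_byte(chunk ^ pattern_a) ||
            has_zero_byte(chunk ^ pattern_b))
            break;
        i += sizeof(uint64_t);
    }
    return i + find_special_scalar(s + i, len - i, a, b);
}
```

A caller would copy the first find_special_chunked(...) bytes in bulk and fall back to per-byte handling only at the returned position. A real SIMD version would do the same with wider registers (16 bytes or more per step).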

Attached is a quick POC patch that uses SIMD instructions for case (1)
above. For sufficiently large attribute values, this is a significant
performance win: roughly 1.9x to 2.5x faster in the benchmarks below. For
small fields, performance looks to be about the same. The results below
are from an M1 MacBook Pro.

======
neilconway=# select count(*), avg(length(a))::int, avg(length(b))::int, avg(length(c))::int from short_strings;
 count  | avg | avg | avg
--------+-----+-----+-----
 524288 |   8 |   8 |   8
(1 row)

neilconway=# select count(*), avg(length(a))::int, avg(length(b))::int, avg(length(c))::int from long_strings;
 count | avg | avg | avg
-------+-----+-----+-----
 65536 | 657 | 657 | 657
(1 row)

master @ 8fea1bd541:

$ for i in ~/*.sql; do hyperfine --warmup 5 "./psql -f $i"; done
Benchmark 1: ./psql -f /Users/neilconway/copy-out-bench-long-quotes.sql
Time (mean ± σ): 2.027 s ± 0.075 s [User: 0.001 s, System: 0.000 s]
Range (min … max): 1.928 s … 2.207 s 10 runs

Benchmark 1: ./psql -f /Users/neilconway/copy-out-bench-long.sql
Time (mean ± σ): 1.420 s ± 0.027 s [User: 0.001 s, System: 0.000 s]
Range (min … max): 1.379 s … 1.473 s 10 runs

Benchmark 1: ./psql -f /Users/neilconway/copy-out-bench-short.sql
Time (mean ± σ): 546.0 ms ± 9.6 ms [User: 1.4 ms, System: 0.3 ms]
Range (min … max): 539.0 ms … 572.1 ms 10 runs

master + SIMD patch:

$ for i in ~/*.sql; do hyperfine --warmup 5 "./psql -f $i"; done
Benchmark 1: ./psql -f /Users/neilconway/copy-out-bench-long-quotes.sql
Time (mean ± σ): 797.8 ms ± 19.4 ms [User: 0.9 ms, System: 0.0 ms]
Range (min … max): 770.0 ms … 828.5 ms 10 runs

Benchmark 1: ./psql -f /Users/neilconway/copy-out-bench-long.sql
Time (mean ± σ): 732.3 ms ± 20.8 ms [User: 1.2 ms, System: 0.0 ms]
Range (min … max): 701.1 ms … 763.5 ms 10 runs

Benchmark 1: ./psql -f /Users/neilconway/copy-out-bench-short.sql
Time (mean ± σ): 545.7 ms ± 13.5 ms [User: 1.3 ms, System: 0.1 ms]
Range (min … max): 533.6 ms … 580.2 ms 10 runs
======

Implementation-wise, it seems complex to use SIMD when
encoding_embeds_ascii is true (which should be uncommon). In principle, we
could probably still use SIMD here, but it would require juggling between
the SIMD chunk size and sizes returned by pg_encoding_mblen(). For now, the
POC patch falls back to the old code path when encoding_embeds_ascii is
true.
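
To illustrate why the encoding_embeds_ascii case is awkward, here is a small sketch under stated assumptions. The toy_mblen() function is a hypothetical stand-in for pg_encoding_mblen(): it models a made-up encoding in which any byte >= 0x80 starts a 2-byte character whose second byte may be an ASCII value. The count_quotes() function and its parameters are likewise illustrative and do not appear in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_encoding_mblen(): in this toy encoding a
 * lead byte >= 0x80 starts a 2-byte character, and the trailing byte may
 * hold any value, including one that looks like an ASCII quote.
 */
static int
toy_mblen(const unsigned char *p)
{
    return (*p >= 0x80) ? 2 : 1;
}

/* Count the quote characters in s that would need escaping. */
static size_t
count_quotes(const char *s, size_t len, char quote, bool embeds_ascii)
{
    size_t      n = 0;
    size_t      i = 0;

    if (!embeds_ascii)
    {
        /*
         * Every byte can be inspected independently of its neighbors, so
         * this loop is the kind that a fixed-size SIMD chunk can replace.
         */
        for (i = 0; i < len; i++)
        {
            if (s[i] == quote)
                n++;
        }
        return n;
    }

    /*
     * Fallback path: advance one character at a time, so the trailing
     * byte of a multibyte character is never mistaken for a quote.
     */
    while (i < len)
    {
        int         l = toy_mblen((const unsigned char *) s + i);

        if (l == 1 && s[i] == quote)
            n++;
        i += (size_t) l;
    }
    return n;
}
```

For the 5-byte input 'a', 0x81, '"', 'b', '"', the character-aware path counts one quote, while the bytewise path counts two. A SIMD chunk boundary can also fall in the middle of a character, which is the juggling between chunk size and character length mentioned above.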

Any feedback would be very welcome.

Cheers,
Neil

[1] https://www.postgresql.org/message-id/CAApHDvpLXwMZvbCKcdGfU9XQjGCDm7tFpRdTXuB9PVgpNUYfEQ@mail.gmail.com

Attachment	Content-Type	Size
0002-Optimize-COPY-TO-.-FORMAT-CSV-using-SIMD-instruction.patch	application/octet-stream	5.8 KB
0001-Remove-inaccurate-comment.patch	application/octet-stream	738 bytes
copy-out-bench-short.sql	application/octet-stream	375 bytes
copy-out-bench-long.sql	application/octet-stream	373 bytes
copy-out-bench-long-quotes.sql	application/octet-stream	403 bytes

