From: | Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Speeding up GIST index creation for tsvectors |
Date: | 2020-12-10 14:31:31 |
Message-ID: | CAJ3gD9eGvKkoZ5+3mkM9jmw1kUhKwQpMEgE-fo8uTSqHMnsmqg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
In hemdistsign() of tsgistidx.c, if we process the bits in 64-bit
chunks rather than byte-by-byte, we get an overall speedup in GiST
index creation for tsvector types. With the default siglen (124), the
speedup is 12-20%. With siglen=700, it is 30-50%. So longer
signature lengths give a higher percentage speedup. The attached
patch 0001 has the required changes.
In patch 0001, rather than applying the xor operator to char values, xor
is applied to 64-bit chunks. And since the chunks are 64-bit,
pg_popcount64() is used on each of the chunks. I have checked that the
two bitvector pointer arguments of hemdistsign() are not always 64-bit
aligned. So the patch processes the leading misaligned bits and the
trailing remainder bits char-by-char, leaving the middle 64-bit chunks
for pg_popcount64().
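
The three-phase loop described above can be sketched roughly as follows. This is not the code from patch 0001: the names hemdist_sketch(), popcount64_sketch() and popcount8_sketch() are made up for illustration, __builtin_popcountll() is assumed to be available on GCC/Clang, and memcpy() is used for the 64-bit loads as one way of coping with the second pointer not sharing the alignment of the first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for pg_popcount64(); hypothetical name, not PostgreSQL code. */
static inline int
popcount64_sketch(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(x);
#else
    int n = 0;

    while (x != 0)
    {
        x &= x - 1;             /* clear the lowest set bit */
        n++;
    }
    return n;
#endif
}

/* Number of set bits in one byte. */
static inline int
popcount8_sketch(unsigned char b)
{
    return popcount64_sketch((uint64_t) b);
}

/*
 * Hamming distance between two signatures of siglen bytes each:
 * leading bytes one at a time until 'a' is 8-byte aligned, then
 * 64-bit chunks, then the trailing remainder one byte at a time.
 */
static int
hemdist_sketch(const unsigned char *a, const unsigned char *b, size_t siglen)
{
    size_t i = 0;
    int dist = 0;

    assert(a != NULL && b != NULL);

    /* Leading misaligned bytes. */
    while (i < siglen && ((uintptr_t) (a + i) % sizeof(uint64_t)) != 0)
    {
        dist += popcount8_sketch((unsigned char) (a[i] ^ b[i]));
        i++;
    }

    /* Middle part: xor and popcount 64 bits at a time. */
    for (; i + sizeof(uint64_t) <= siglen; i += sizeof(uint64_t))
    {
        uint64_t wa;
        uint64_t wb;

        memcpy(&wa, a + i, sizeof(wa));     /* safe even if b is misaligned */
        memcpy(&wb, b + i, sizeof(wb));
        dist += popcount64_sketch(wa ^ wb);
    }

    /* Trailing remainder bytes. */
    for (; i < siglen; i++)
        dist += popcount8_sketch((unsigned char) (a[i] ^ b[i]));

    return dist;
}
```

For two 124-byte signatures, a call such as hemdist_sketch(sig1, sig2, 124) would do at most 7 leading byte steps and then 14 or 15 chunk steps, depending on the alignment of sig1.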
We might extend this to the hemdistsign() definitions in other places
in the code. But for now, we can start with GiST. I haven't tried
other places.
-------------
While working on this, I observed that on platforms other than x86_64,
we still declare pg_popcount64() as a function pointer, even though we
don't use the runtime selection of the right function using __get_cpuid()
as is done on x86.
The other patch, i.e. 0002, is a general optimization that avoids this
function pointer for pg_popcount32/64() calls. The patch arranges for a
direct function call, so as to get rid of the function pointer
dereference each time pg_popcount32/64 is called.
To do this, pg_popcount64 is defined as another function name
(pg_popcount64_nonasm) rather than as a function pointer, whenever
USE_POPCNT_ASM is not defined. And pg_popcount64_nonasm() is a
static inline function, so that whenever pg_popcount64() is called,
__builtin_popcount() gets called directly. Platforms not
supporting __builtin_popcount() continue to use the slow version, as is
the current behaviour.
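
A minimal sketch of that arrangement, under some assumptions, could look like the following. It is not the text of patch 0002: pg_popcount64_slow_sketch() and popcount_words_sketch() are hypothetical names, the compiler check is a simplification of whatever configure test the real code uses, and __builtin_popcountll() is used as the 64-bit member of the __builtin_popcount() family.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slow bit-by-bit fallback; hypothetical name used only in this sketch. */
static inline int
pg_popcount64_slow_sketch(uint64_t word)
{
    int result = 0;

    while (word != 0)
    {
        result += (int) (word & 1);
        word >>= 1;
    }
    return result;
}

#ifdef USE_POPCNT_ASM
/*
 * Runtime-selected implementation: every call goes through a pointer.
 * (Declaration only; the definition would live elsewhere.)
 */
extern int (*pg_popcount64) (uint64_t word);
#else
/* No runtime selection needed, so call a static inline function directly. */
static inline int
pg_popcount64_nonasm(uint64_t word)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(word);
#else
    return pg_popcount64_slow_sketch(word);
#endif
}

#define pg_popcount64(word) pg_popcount64_nonasm(word)
#endif

/* Example call site: callers keep using the pg_popcount64() name. */
static int
popcount_words_sketch(const uint64_t *words, size_t nwords)
{
    int total = 0;

    assert(words != NULL || nwords == 0);
    for (size_t i = 0; i < nwords; i++)
        total += pg_popcount64(words[i]);
    return total;
}
```

Because callers still use the pg_popcount64() name, call sites need no change; only the header decides whether that name is a pointer or an inlinable function.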
Tested this 0002 patch on ARM64, with patch 0001 already applied, and the
gist index creation for tsvectors *further* speeds up by 6% for
default siglen (=124), and by 12% with siglen=700.
-------------
Schema :
CREATE TABLE test_tsvector(t text, a tsvector);
-- Attached tsearch.data (a bigger version of
-- src/test/regress/data/tsearch.data)
\COPY test_tsvector FROM 'tsearch.data';
Test case that shows improvement :
CREATE INDEX wowidx6 ON test_tsvector USING gist (a);
Time taken by the above create-index command, in seconds, along with %
speed-up w.r.t. HEAD :
A) siglen=124 (Default)
         head      0001.patch    0001+0002.patch
x86      .827      .737 (12%)    .....
arm      1.098     .912 (20%)    .861 (28%)
B) siglen=700 (... USING gist (a tsvector_ops(siglen=700)))
         head      0001.patch    0001+0002.patch
x86      1.121     .847 (32%)    .....
arm      1.751     1.191 (47%)   1.062 (65%)
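
The percentages in the tables appear to be computed as head time divided by patched time, minus one. That is an inference from the numbers, not something stated explicitly above. A small helper (speedup_percent_sketch() is a made-up name) reproduces every figure under that assumption:

```c
#include <assert.h>

/*
 * Speed-up of a patched run relative to HEAD, as a rounded percentage:
 * (head / patched - 1) * 100.  Assumes the patched run is not slower.
 */
static int
speedup_percent_sketch(double head_secs, double patched_secs)
{
    assert(patched_secs > 0.0);
    assert(head_secs >= patched_secs);
    return (int) ((head_secs / patched_secs - 1.0) * 100.0 + 0.5);
}
```

For example, speedup_percent_sketch(1.098, 0.912) gives 20 and speedup_percent_sketch(1.751, 1.062) gives 65, matching the arm rows.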
--
Thanks,
-Amit Khandekar
Huawei Technologies
Attachment | Content-Type | Size |
---|---|---|
tsearch.data.gz | application/gzip | 3.1 MB |
0001-Speed-up-xor-ing-of-two-gist-index-signatures-for-ts.patch | text/x-patch | 3.8 KB |
0002-Avoid-function-pointer-dereferencing-for-pg_popcount.patch | text/x-patch | 7.1 KB |