From: | Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Ashwin Agrawal <ashwinstar(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Subject: | brininsert optimization opportunity |
Date: | 2023-07-03 22:21:58 |
Message-ID: | CAE-ML+9r2=aO1wwji1sBN9gvPz2xRAtFUGfnffpd0ZqyuzjamA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello hackers,
My colleague, Ashwin, pointed out to me that brininsert's per-tuple init
of the revmap access struct can have non-trivial overhead.
Turns out he is right. We are saving 24 bytes of memory per-call for
the access struct, and a bit on buffer/locking overhead, with the
attached patch.
The implementation ties the revmap cleanup as a MemoryContext callback
to the IndexInfo struct's MemoryContext, as there is no teardown
function provided by the index AM for end-of-insert-command.
Test setup (local Ubuntu workstation):
# Drop caches and restart between each run:
sudo sh -c "sync; echo 3 > /proc/sys/vm/drop_caches;"
pg_ctl -D /usr/local/pgsql/data/ -l /tmp/logfile restart
\timing
DROP TABLE heap;
CREATE TABLE heap(i int);
CREATE INDEX ON heap USING brin(i) WITH (pages_per_range=1);
INSERT INTO heap SELECT 1 FROM generate_series(1, 200000000);
Results:
We see an improvement for 100M tuples and an even bigger improvement for
200M tuples.
Master (29cf61ade3f245aa40f427a1d6345287ef77e622):
test=# INSERT INTO heap SELECT 1 FROM generate_series(1, 100000000);
INSERT 0 100000000
Time: 222762.159 ms (03:42.762)
-- 3 runs
test=# INSERT INTO heap SELECT 1 FROM generate_series(1, 200000000);
INSERT 0 200000000
Time: 471168.181 ms (07:51.168)
Time: 457071.883 ms (07:37.072)
TimeL 486969.205 ms (08:06.969)
Branch:
test2=# INSERT INTO heap SELECT 1 FROM generate_series(1, 100000000);
INSERT 0 100000000
Time: 200046.519 ms (03:20.047)
-- 3 runs
test2=# INSERT INTO heap SELECT 1 FROM generate_series(1, 200000000);
INSERT 0 200000000
Time: 369041.832 ms (06:09.042)
Time: 365483.382 ms (06:05.483)
Time: 375506.144 ms (06:15.506)
# Profiled backend running INSERT of 100000000 rows
sudo perf record -p 11951 --call-graph fp sleep 180
Please see attached perf diff between master and branch. We see that we
save on a bit of overhead from brinRevmapInitialize(),
brinRevmapTerminate() and lock routines.
Regards,
Soumyadeep (VMware)
Attachment | Content-Type | Size |
---|---|---|
perf_diff.out | application/octet-stream | 16.3 KB |
v1-0001-Reuse-revmap-and-brin-desc-in-brininsert.patch | text/x-patch | 4.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2023-07-03 22:29:02 | Re: Add information about command path and version of flex in meson output |
Previous Message | Tom Lane | 2023-07-03 22:20:18 | Re: Size vs size_t or, um, PgSize? |