From: | Anton Dignös <dignoes(at)inf(dot)unibz(dot)it> |
---|---|
To: | Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: IndexJoin memory problem using spgist and boxes |
Date: | 2018-03-18 19:56:31 |
Message-ID: | CALNdv1gfKVidXz3EQBY5EcEAnNTxELM1LM6q+2rd3szKxwXhPw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
attached is the patch that uses two memory contexts.
One for calling the inner consistent function,
and a new one for keeping the traversal memory of the inner consistent function.
I run some test to compare the memory footprints. I report the total maximum
memory usage (sum of all children) of the per-query memory context
that is the parent of all memory
contexts used. The test datasets are described below.
TEST | HEAD | previous patch | patch V2
------+-------+----------------+----------
T1 | 3.4GB | 98kB | 81kB
T2 | 3.4GB | 17MB | 17MB
T3 | 3.5GB | 17MB | 17MB
T4 | 7GB | 17MB | 17MB
T5 | 8GB | 34MB | 25MB
T1: 1M x 1M tuples with relatively few overlaps
T2: as T1, but with 1 tuple in the outer relation for which the entire
index reports matches
T3: as T1, but with 10 tuples in the outer relation for which the
entire index reports matches
T4: as T3, but the outer relation has double the number of tuples
T5: as T4, but the inner relation has double the number of tuples
TEST dataset creation queries (executed incrementally)
T1:
-- create table r with 1M tuples
CREATE TABLE r AS SELECT 1 i, box(point(generate_series,
generate_series), point(generate_series+10, generate_series+10)) FROM
generate_series(1, 1000000);
-- create table s with 1M tuples
CREATE TABLE s AS SELECT 1 i, box(point(generate_series,
generate_series), point(generate_series+10, generate_series+10)) FROM
generate_series(1, 1000000) ORDER BY random(); -- random sort just
speeds up index creation
CREATE INDEX s_idx ON s USING spgist(box);
T2:
-- inserts a tuple for which the entire index is a match
INSERT INTO r VALUES (2, box(point(1, 1), point(1000000, 1000000)));
T3:
-- inserts 9 more tuples as in T2.
T4:
-- doubles the outer relations
INSERT INTO r SELECT * FROM r;
T5:
-- doubles the inner relation
INSERT INTO s SELECT * FROM s;
The new patch is a bit better in terms of memory by using the two
memory contexts.
I also checked the query times using explain analyze, both patches
have approximately the same runtime,
but run 5-6 times faster than HEAD.
Best regards,
Anton
Attachment | Content-Type | Size |
---|---|---|
spgist-idxscan-mem-fix-V2.patch | application/octet-stream | 2.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Dmitry Dolgov | 2018-03-18 20:42:14 | jsonb nesting level edge case |
Previous Message | David Fetter | 2018-03-18 19:36:53 | Re: Implementing SQL ASSERTION |