Re: Constant time insertion into highly non-unique indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Constant time insertion into highly non-unique indexes
Date: 2005-04-14 16:10:48
Message-ID: 1342.1113495048@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Just to check if I was nuts or not, I made up a test case:

create table foo (f1 text);
create index fooi on foo(f1);

then

truncate foo;
copy foo from stdin;
0000999999
0000999998
0000999997
... one million rows ...
0000000002
0000000001
0000000000
\.

versus

truncate foo;
copy foo from stdin;
xxxxxxxxxx
xxxxxxxxxx
xxxxxxxxxx
... one million rows ...
xxxxxxxxxx
xxxxxxxxxx
xxxxxxxxxx
\.

The first of these should of course force a btree split on the first
page each time it splits, while the second will involve the
probabilistic moveright on each split. But the files will be exactly
the same size.

[tgl(at)rh1 ~]$ time psql -f zdecr10 test
TRUNCATE TABLE

real 1m41.681s
user 0m1.424s
sys 0m0.957s
[tgl(at)rh1 ~]$ time psql -f zsame10 test
TRUNCATE TABLE

real 1m40.927s
user 0m1.409s
sys 0m0.896s
[tgl(at)rh1 ~]$

And it just happens that I had this server built with profiling enabled,
so I was able to look at the stats for each run, and I see that
_bt_compare is actually called slightly *fewer* times in the
all-identical-data case.

zdecr10:
2954536 _bt_moveright <cycle 3> [57]
20938485 _bt_binsrch <cycle 3> [48]
0.05 0.18 6491/3006701 _bt_insertonpg <cycle 2> [16]
[33] 4.9 11.73 0.00 23899512 _bt_compare <cycle 3> [33]
21944768 FunctionCall2 <cycle 3> [24]

zsame10:
2935922 _bt_moveright <cycle 3> [62]
17156948 _bt_binsrch <cycle 3> [54]
3.45 11.09 465429/3072793 _bt_insertonpg <cycle 2> [16]
[31] 5.0 11.56 0.00 20558299 _bt_compare <cycle 3> [31]
18622167 FunctionCall2 <cycle 3> [24]

So the theory does work, at least for small index entries. Currently
repeating with wider ones ...

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2005-04-14 16:42:48 Re: Interactive docs idea
Previous Message Simon Riggs 2005-04-14 15:55:16 Re: Constant time insertion into highly non-unique