From: | "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: adding new pages bulky way |
Date: | 2005-06-08 08:03:36 |
Message-ID: | d868sp$1efk$1@news.hub.org |
Lists: | pgsql-hackers |
"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes
>
> I very seriously doubt that there would be *any* win
>
I did a quick proof-of-concept implementation to test non-concurrent batch
insertion; here is the result:
Environment:
- Pg8.0.1
- NTFS / IDE
-- batch 16 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4167.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8111.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 16444.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 41980.000 ms
-- batch 32 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4086.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 7861.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 16403.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 41290.000 ms
-- batch 64 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4236.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8202.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 17265.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 44063.000 ms
-- batch 128 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4256.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8242.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 17375.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 43854.000 ms
-- one page extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4496.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 9013.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 19508.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 49962.000 ms
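To compare a batched run against the one-page baseline, the saving can be computed directly from the timings above. This is only a small helper sketch; the function name percent_saved is made up for illustration, and the figures come from single runs, so they are indicative rather than precise.

```c
#include <assert.h>

/*
 * Percentage of elapsed time saved relative to the one-page baseline.
 * Hypothetical helper, used only to read the timings in this message.
 */
double percent_saved(double baseline_ms, double batched_ms)
{
    assert(baseline_ms > 0.0);
    return 100.0 * (baseline_ms - batched_ms) / baseline_ms;
}
```

For the 32-page batch, percent_saved(4496.0, 4086.0) gives roughly 9 for the smallest insert and percent_saved(49962.0, 41290.0) gives roughly 17 for the largest.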
The benefit is there: roughly a 10% improvement if we select a good batch
size. The explanation: if a batch insertion needs 6400 new pages, originally
it does write() plus file system logging 6400 times; now it does so 6400/64
times (though each call costs more). Also, since writes of different sizes
have different costs, 32 seems to be the optimal choice on my machine.
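The call-count arithmetic can be written down as a small sketch. The function name extension_calls is hypothetical; it only counts how many extension calls are needed for a given batch size, rounding up when the page count is not a multiple of the batch.

```c
#include <assert.h>

/*
 * Number of extension calls needed to add `pages` new pages when each
 * call extends the file by `batch` pages. Rounds up, so a partial last
 * batch still counts as one call.
 */
long extension_calls(long pages, long batch)
{
    assert(pages >= 0 && batch > 0);
    return (pages + batch - 1) / batch;
}
```

With 6400 new pages this gives 6400 calls for one-page extension, 200 calls for a batch of 32, and 100 calls for a batch of 64.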
The changes I made include:
(1) md.c
Modify function mdextend():
- extend 64 pages each time;
- after extension, let FSM be aware of it (change FSM a little bit so it
could report freespace also for an empty page)
(2) bufmgr.c
Make ReadPage(+empty_page) treat an empty page differently from a non-empty
one to avoid an unnecessary read for new pages, that is:
    if (!empty_page)
        smgrread(reln->rd_smgr, blockNum, (char *) MAKE_PTR(bufHdr->data));
    else
        /* Only for heap pages, and a race could be here ... */
        PageInit((char *) MAKE_PTR(bufHdr->data), BLCKSZ, 0);
(3) hio.c
RelationGetBufferForTuple():
- pass correct "empty_page" parameter to ReadPage() according to the query
result from FSM.
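The idea behind steps (1) and (2) can be illustrated outside the backend with plain POSIX calls. This is a standalone sketch, not the patch itself: extend_batch, load_page, and BLOCK_BYTES are hypothetical names, the 8192-byte page size is an assumption, and nothing here deals with the FSM, locking, or the race mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_BYTES 8192        /* assumed page size for this sketch */

/*
 * Append `npages` zero-filled pages to the end of fd with one write(),
 * instead of one write() per page. Returns 0 on success, -1 on failure.
 */
int extend_batch(int fd, int npages)
{
    size_t  len;
    char   *zeros;
    ssize_t written;

    if (npages <= 0)
        return -1;
    len = (size_t) npages * BLOCK_BYTES;
    zeros = calloc(1, len);
    if (zeros == NULL)
        return -1;
    if (lseek(fd, 0, SEEK_END) == (off_t) -1)
    {
        free(zeros);
        return -1;
    }
    written = write(fd, zeros, len);
    free(zeros);
    return (written == (ssize_t) len) ? 0 : -1;
}

/*
 * Fill buf with page `blockno`. When the caller already knows the page
 * is new and empty, initialize it in memory and skip the read entirely.
 * Returns 0 on success, -1 on failure or short read.
 */
int load_page(int fd, long blockno, char *buf, int empty_page)
{
    assert(buf != NULL);
    if (empty_page)
    {
        memset(buf, 0, BLOCK_BYTES);    /* no I/O for a known-empty page */
        return 0;
    }
    if (lseek(fd, (off_t) blockno * BLOCK_BYTES, SEEK_SET) == (off_t) -1)
        return -1;
    return (read(fd, buf, BLOCK_BYTES) == BLOCK_BYTES) ? 0 : -1;
}
```

A caller would extend the file once with extend_batch(fd, 64) and then pass empty_page = 1 to load_page for each of the freshly added pages, which is the role the FSM lookup plays in step (3).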
Regards,
Qingqing