Re: adding new pages bulky way

From: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: adding new pages bulky way
Date: 2005-06-08 08:03:36
Message-ID: d868sp$1efk$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes
>
> I very seriously doubt that there would be *any* win
>

I did a quick proof-concept implemenation to test non-concurrent batch
insertion, here is the result:

Envrionment:
- Pg8.0.1
- NTFS / IDE

-- batch 16 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4167.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8111.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 16444.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 41980.000 ms

-- batch 32 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4086.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 7861.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 16403.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 41290.000 ms

-- batch 64 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4236.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8202.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 17265.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 44063.000 ms

-- batch 128 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4256.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8242.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 17375.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 43854.000 ms

-- one page extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4496.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 9013.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 19508.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 49962.000 ms

Benefits are there, and it is an approximate 10% improvement if we select
good batch size. The explaination is: if a batch insertion need 6400 new
pages, originally it does write()+file system logs 6400 times, now it does
6400/64 times(though each time the time cost is bigger). Also, considering
write with different size have different cost, seems for my machine 32 is
the an optimal choice.

What I did include:

(1) md.c
Modify function mdextend():
- extend 64 pages each time;
- after extension, let FSM be aware of it (change FSM a little bit so it
could report freespace also for an empty page)

(2) bufmgr.c
make ReadPage(+empty_page) treat different of an empty page and non-empty
one to avoid unnecesary read for new pages, that is:
if (!empty_page)
smgrread(reln->rd_smgr, blockNum, (char *) MAKE_PTR(bufHdr->data));
else
PageInit((char *) MAKE_PTR(bufHdr->data), BLCKSZ, 0); /* Only for
heap pages and race could be here ... */

(3) hio.c
RelationGetBufferForTuple():
- pass correct "empty_page" parameter to ReadPage() according to the query
result from FSM.

Regards,
Qingqing

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2005-06-08 08:22:38 Re: The Contrib Roundup (long)
Previous Message Michael Meskes 2005-06-08 07:27:25 Re: linuxtag 2005