Re: [CFReview] Red-Black Tree

From: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [CFReview] Red-Black Tree
Date: 2010-02-04 13:19:20
Message-ID: 4B6AC958.6000105@siriusit.co.uk
Lists: pgsql-hackers

Robert Haas wrote:

> Maybe we are now getting to the heart of the confusion. Mark wrote in
> his email: "Unfortunately I was not really able to reproduce the RND
> (teodor's) dataset, nor the random array test as the SQL used to test
> the implementation was not present on the page above." The SQL for
> the fixed-length tests is posted, but the SQL for the variable length
> test is not - so Mark was just guessing on that one.
>
> Or am I just totally confused?
>
> ...Robert

No, that's correct. In the "Repeat test with 100,000 identical records
varying array length (len)" section, it's fairly easy to substitute in
the different values of len (3, 30 and 50). As documented in my review
email, I took a guess at generating the contents of the RND (teodor's)
column with this query:

select ARRAY(select generate_series(1, (random() * 100)::int)) as arand
into arrrand from generate_series(1,100000) b;

However, unlike the other figures, this is quite a bit different from
Oleg/Teodor's results, which makes me think this is the wrong query (3.5s
vs 9s). Obviously Robert's concern here is that it is this column that
shows one of the largest performance decreases compared to head.
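
One way to narrow down whether the guessed query is at fault would be to
look at what it actually generated. The following is only a sketch, and
it assumes the arrrand table and arand column from my query above and
PostgreSQL 8.4 or later for array_length(); the column aliases are
purely illustrative:

```sql
-- Summarise the arrays produced by the guessed RND query.
-- array_length() returns NULL for an empty (or NULL) array, so
-- coalesce to 0; the query can yield empty arrays when
-- random() * 100 rounds to 0.
select min(coalesce(array_length(arand, 1), 0)) as min_len,
       avg(coalesce(array_length(arand, 1), 0)) as avg_len,
       max(coalesce(array_length(arand, 1), 0)) as max_len,
       count(distinct arand)                    as n_distinct,
       count(*)                                 as n_rows
from arrrand;
```

Since the ARRAY(...) subquery does not reference b, it is worth checking
whether it is evaluated for every row or only once for the whole
statement. If min_len equals max_len and n_distinct is 1, every row
received the same array, which would not be a random dataset at all.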

I also finished benchmarking the index creation scripts yesterday on
Oleg's test dataset from
http://www.sai.msu.su/~megera/postgres/files/links2.sql.gz. With
maintenance_work_mem set to 256MB, the times I got with the rbtree patch
applied were:

rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin);
CREATE INDEX
Time: 1910741.352 ms

rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout);
CREATE INDEX
Time: 1647609.300 ms

Without the patch applied, I ended up having to shut down my laptop after
around 90 mins, before the first index had even been created. So there is
a definite order of magnitude speed increase with this patch applied.
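
To put the psql timings on the same scale as the 90 minute figure, the
milliseconds reported above can be converted to minutes. This is plain
arithmetic on the two numbers already quoted; the column aliases are
just labels I have made up:

```sql
-- Convert the reported CREATE INDEX times from milliseconds to minutes
-- (60000 ms per minute), rounded to one decimal place.
select round(1910741.352 / 60000.0, 1) as idin_minutes,
       round(1647609.300 / 60000.0, 1) as idout_minutes;
```

That works out to roughly 31.8 minutes for idin and 27.5 minutes for
idout with the patch. The unpatched run gives only a lower bound, since
it was stopped before the first index completed.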

ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs
