From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP: SP-GiST, Space-Partitioned GiST |
Date: | 2011-09-22 10:52:51 |
Message-ID: | CAPpHfduCoiSWL7=EQ9XgaQifWVubz8xDiA6feQvix+6PqH+eRw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Sep 22, 2011 at 2:05 PM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Regarding the quadtree, have you compared the performance of that with
> Alexander's improved split algorithm? I ran some tests using the test
> harness I still had lying around from the fast GiST index build tests:
>
> testname | time | accesses | indexsize
> -------------------------+----**-------------+----------+-----**------
> points unordered auto | 00:03:58.188866 | 378779 | 522 MB
> points ordered auto | 00:07:14.362355 | 177534 | 670 MB
> points unordered auto | 00:02:59.130176 | 46561 | 532 MB
> points ordered auto | 00:04:00.50756 | 45066 | 662 MB
> points unordered spgist | 00:03:05.569259 | 78871 | 394 MB
> points ordered spgist | 00:01:46.06855 | 422104 | 417 MB
> (8 rows)
>
I assume first two rows to be produced by new linear split
algorithm(current) and secound two rows by double sorting split algorithm(my
patch).
> These tests were with a table with 7500000 random points. In the
> ordered-tests, the table is sorted by x,y coordinates. 'time' is the time
> used to build the index on it, and 'accesses' is the total number of index
> blocks hit by a series of 10000 bounding box queries, measured from
> pg_statio_user_indexes.idx_**blks_hit + idx_blks_read.
>
> The first two tests in the list are with a GiST index on unpatched
> PostgreSQL. The next six tests are with Alexander's double-sorting split
> patch. The last two tests are with an SP-GiST index.
>
> It looks like the query performance with GiST using the double-sorting
> split is better than SP-GiST, although the SP-GiST index is somewhat
> smaller. The ordered case seems pathologically bad, is that some sort of a
> worst-case scenario for quadtrees?
Comparison of search speed using number of page accesses is
quite comprehensive for various GiST indexes. But when we're
comparing SP-GiST vs GiST we should take into accoung that they have
different CPU/IO ratio. GiST scans whole page which it accesses. SP-GiST can
scan only fraction of page because several nodes can be packed into single
page. Thereby it would be interesting to compare also CPU load GiST
vs. SP-GiST. Also, there is some hope to reduce number of page accesses in
SP-GiST by improving clustering algorithm.
------
With best regards,
Alexander Korotkov.
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2011-09-22 11:22:36 | Re: Double sorting split patch |
Previous Message | MUHAMMAD ASIF | 2011-09-22 10:51:11 | Re: PostgreSQL X/Open Socket / BSD Socket Issue on HP-UX |