From: | Boris Muratshin <bmuratshin(at)gmail(dot)com> |
---|---|
To: | David Fetter <david(at)fetter(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: 3D Z-curve spatial index |
Date: | 2017-02-06 07:01:09 |
Message-ID: | CAO+8Nm6yBMW2bqfL=Bugs0Z0xMe+CrMDrnYxOAGrUYV-nyakmw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thanks David,
I am really a novice in the community.
But thing I wrote about is not a patch to something but a standalone
extension.
And I've placed it on github (https://github.com/bmuratshin/zcurve/tree/3D)
+ attached.
The algorithm is not documented well because it is rather an experimental
work
and I am frequently changing the interfaces.
In the README.zcurve file you can find the instruction how to get the
numbers in the figures above,
and just in case I'll post it here:
------------------------------------------------------------------------------------------------------------------
To test SQL 3D Z-curve interface (on built and registered extension) you
need:
1) Create points table (psql):
create table test_points_3d (x integer,y integer, z integer);
2) Create test dataset:
2.1) Make gawk script, let it be 'mkdata.awk':
BEGIN{
for (i = 0; i < 100000000; i++)
{
x = int(1000000 * rand());
z = int(1000000 * rand());
print "("x",0,"z")";
}
}
2.2) Fill txt file with this script:
gawk -f mkdata.awk > data_3d.csv
3.3) copy it into the table (psql)
COPY test_points_3d from '/home/.../postgresql/contrib/zcurve/data_3d.csv';
3) Create index (psql):
create index zcurve_test_points_3d on test_points_3d(zcurve_num_from_xyz(x,
y, z));
4) Create test requests set:
4.1) Make gawk script, let it be 'mktest.awk':
BEGIN{
for (i = 0; i < 100000; i++)
{
x = 1000 * int(1000 * rand());
y = 0;
z = 1000 * int(1000 * rand());
#EXPLAIN (ANALYZE,BUFFERS)
print "select count(1) from
zcurve_3d_lookup_tidonly('zcurve_test_points_3d',
"x","y","z","x+1000","y+1000","z+1000");";
}
}
box 100X100X1000 gives us 100 points in average
2.2) Fill testing file with this script:
gawk -f mktest.awk > test_100.sql
5) For getting times, run:
date; psql -f test_100.sql ; date;
and divide the difference by 100 000
5) For getting reads,
5.1) insert 'EXPLAIN (ANALYZE,BUFFERS)' into select preparation in gawk
script:
5.2) run 'date; psql -f test_100.sql > 100.out ; date;'
5.3) prepare gawk script (let it be 'buf.awk')
BEGIN {
FS="=";
num = 0;
sum = 0;
sum2 = 0;
}
if (NF==3)
{
sum += $3;
sum2 += $2;
}
else if (NF==2)
{
if ($1==" Buffers: shared hit")
sum2 += $2;
else if ($1==" Buffers: read")
sum += $2;
}
num++;
END {
print sum/(num)" "sum2/(num)" "num;
}
5.4) run in over 100.out
gawk -f buf.awk 100.out
5.5) first number in its output is 'shared reads', second - 'shared hits'
------------------------------------------------------------------------------------------------------------------
Regards,
Boris
Attachment | Content-Type | Size |
---|---|---|
zcurve-3D.zip | application/zip | 28.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tsunakawa, Takayuki | 2017-02-06 07:01:31 | [RFC] Should "SHOW huge_pages" display the effective value "off" when the huge page is unavailable? |
Previous Message | Amit Khandekar | 2017-02-06 06:45:05 | Re: Parallel Append implementation |