From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | markokr(at)gmail(dot)com |
Cc: | mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com |
Subject: | Re: Speed dblink using alternate libpq tuple storage |
Date: | 2012-01-30 09:06:57 |
Message-ID: | 20120130.180657.220412574.horiguchi.kyotaro@oss.ntt.co.jp |
Lists: | pgsql-hackers |
Thank you for the comments. This is a revised version of the patch.
The performance gain is larger than expected. The measurement script now
runs the query via dblink ten times to stabilize the measurement, so
the figures are about ten times larger than the previous ones.
                       sec    % to Original
Original             : 31.5       100.0%
RowProcessor patch   : 31.3        99.4%
dblink patch         : 24.6        78.1%
The RowProcessor patch alone causes no loss, or a very small gain, and
the full patch gives a 22% gain on the benchmark(*1).
The modifications are listed below.
- PGresAttValue is no longer used for this mechanism; PGrowValue has
been added instead. PGresAttValue has been moved back to
libpq-int.h.
- pqAddTuple() is restored to its original form, and a new function
pqAddRow() is added for use as the RowProcessor. (The previous
pqAddTuple() implementation buggily mixed the two usages of
PGresAttValue.)
- PQgetRowProcessorParam has been dropped. The context parameter
is now passed as one of the parameters of RowProcessor().
- RowProcessor() returns int (used as a bool; is that the libpq
convention?) instead of void *. (Actually, void * had already become
useless as of the previous patch.)
- PQsetRowProcessorErrMes() now does strdup internally.
- The callers of RowProcessor() no longer set null_field in
PGrowValue.value. In addition, the PGrowValue[] that RowProcessor()
receives has nfields + 1 elements, so that a rough size estimate can
be made as cols->value[nfields].value - cols->value[0].value -
something. The something here is 4 * nfields for protocol 3 and
4 * (non-null fields) for protocol 2. I fear that this applies
only to textual transfer...
- PQregisterRowProcessor() sets the default handler when given
NULL. (pg_conn|pg_result).rowProcessor is never NULL during its
lifetime.
- initStoreInfo() and storeHandler() now handle malloc errors.
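
The rough size estimate described in the list above could be sketched as follows. This is only an illustration under assumptions: RowValue is a hypothetical stand-in for the patch's PGrowValue (the field names and the "negative len means NULL" rule are assumed, not taken from the patch), and estimate_row_bytes() is a made-up helper name, not part of libpq.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's PGrowValue; field names are assumed. */
typedef struct
{
    int         len;    /* data length in bytes; negative means NULL (assumed) */
    const char *value;  /* pointer into the receive buffer */
} RowValue;

/*
 * Rough estimate of the payload bytes in one row.
 *
 * cols must have nfields + 1 elements; the extra element marks the end of
 * the row in the receive buffer. The span between the first and the extra
 * element includes the 4-byte length words, which are subtracted:
 *   protocol 3: one length word per field
 *   protocol 2: one length word per non-null field
 */
static size_t
estimate_row_bytes(const RowValue *cols, int nfields, int protocol)
{
    ptrdiff_t   span;
    ptrdiff_t   overhead;
    int         lenwords = 0;
    int         i;

    assert(cols != NULL && nfields >= 0);

    span = cols[nfields].value - cols[0].value;

    if (protocol >= 3)
        lenwords = nfields;
    else
    {
        for (i = 0; i < nfields; i++)
        {
            if (cols[i].len >= 0)
                lenwords++;
        }
    }

    overhead = (ptrdiff_t) 4 * lenwords;

    /* Guard against a negative result; this is only an estimate. */
    return (span > overhead) ? (size_t) (span - overhead) : 0;
}
```

A row processor could call such a helper once per row to pre-size its storage before copying the values. As noted above, the estimate probably holds only for textual transfer.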
And more..
- getAnotherTuple()@fe-protocol2.c has not been tested at all.
- Because all columns in the test data have the same size, realloc
is never executed in dblink... More testing should be done.
regards,
=====
(*1) The benchmark was run as follows:
==test.sql
select dblink_connect('c', 'host=localhost dbname=test');
select * from dblink('c', 'select a,c from foo limit 2000000') as (a text, b bytea) limit 1;
...(repeat 9 times more)
select dblink_disconnect('c');
==
$ for i in $(seq 1 10); do time psql test -f t.sql; done
The environment is
CentOS 6.2 on VirtualBox on Core i7 965 3.2GHz
# of processor 1
Allocated mem 2GB
Test DB schema is
Column | Type | Modifiers
--------+-------+-----------
a | text |
b | text |
c | bytea |
Indexes:
"foo_a_bt" btree (a)
"foo_c_bt" btree (c)
test=# select count(*),
min(length(a)) as a_min, max(length(a)) as a_max,
min(length(c)) as c_min, max(length(c)) as c_max from foo;
count | a_min | a_max | c_min | c_max
---------+-------+-------+-------+-------
2000000 | 29 | 29 | 29 | 29
(1 row)
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
libpq_rowproc_20120130.patch | text/x-patch | 19.1 KB |
libpq_rowproc_doc_20120130.patch | text/x-patch | 5.5 KB |
dblink_use_rowproc_20120130.patch | text/x-patch | 11.7 KB |