Re: dblink performance

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
Cc: Alex Bible <Alex(dot)Bible(at)ctg(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: dblink performance
Date: 2011-12-07 21:49:24
Message-ID: CAHyXU0zJJa8mJ=SipWSp4bLbPnFVkLPKH_XGssCU8BSrevoC0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Dec 7, 2011 at 3:37 PM, Marc Mamin <M(dot)Mamin(at)intershop(dot)de> wrote:
>> >
>> > I find dblink being a nice tool as long as the data volume to
>> transfer
>> > remains low.
>> > I've evaluated it to implement a clustered Postgres environment, but
>> > gave it up due to the poor performances.
>> > Still waiting for the binary transfer before the next try ;-)
>>
>> Binary transfer is not a super big deal in terms of performance
>> actually in the general case. It's only substantially faster in a few
>> cases like timestamp, geo types, and of course bytea.  Lack of
>> parameterization I find to be a bigger deal actually -- it's more of a
>> usability headache than a performance thing.
>>
>> Also FYI binary dblink between databases is going to be problematic
>> for any non built in type unless the type oids are synchronized across
>> databases.
>>
>> merlin
>
> Thanks,
> ... so I don't really understand where all the time get lost in the
> example I posted a few weeks ago:
>
> http://archives.postgresql.org/pgsql-general/2011-09/msg00436.php

you wrote:
"select count(*) from
(
select * from dblink('a','select * from test_cluster')as t1(a int)
union all
select * from dblink('a','select * from test_cluster')as t1(a int)
union all
select * from dblink('a','select * from test_cluster')as t1(a int)
)foo

is about 5 times slower than an equivalent query run locally.

working with asynchron. queries (dblink_send_query) does not bring much
benefit
so that much time seems to be spent for transfer and merge"

it's not exactly fair to compare dblink to local query -- dblink
method requires having to marshal all the data over the protoocl and
un-marshal on the other end. I was seeing 3-5x times difference vs
local query but this is to be expected. note the 'union all' had
nothing to so with your performance problems. Also the querying
server can do a very special trick for count(*) since it only needs to
check tuple visibility that can't be done when doing select count(*)
from (<dblink_queries>).

My point up thread is that dblink is going to be comparable to other
methods that involve querying the data off the server and doing the
processing on the client side.

merlin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jason Long 2011-12-08 02:41:22 Query Slowdown after upgrade from 9.1.1 to 9.1.2
Previous Message Pavel Stehule 2011-12-07 21:44:26 Re: Tables creation date and time