Re: DBLink: interesting issue

From: Joe Conway <mail(at)joeconway(dot)com>
To: Oleg Lebedev <oleg(dot)lebedev(at)waterford(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DBLink: interesting issue
Date: 2002-09-20 23:36:30
Message-ID: 3D8BB0FE.2010206@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Oleg Lebedev wrote:
> I am using dbLink library to synchronize tables in two databases. Below
> are two identical in terms of their results queries, one is using
> NOT EXISTS and another NOT IN. The later one is 100 time slower then the
> former one according to EXPLAIN. However, the former one throws an
> interesting exception every time I run it. Interestingly, it works fine
> if the local table is empty. Any ideas why it happens?

I need more information. What version of PostgreSQL, and is dblink the version
distributed with Postgres, or is it a later copy from CVS?

I have a machine running 7.2.2 and a CVS (Version 0.4, ~mid April, 2002) copy
of dblink, and cannot reproduce the failure:

testmst=# select version();
version
-------------------------------------------------------------
PostgreSQL 7.2.2 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

testslv=# SELECT objectid FROM activity WHERE NOT EXISTS (SELECT remoteid FROM
(SELECT CAST (dblink_tok(t1.dblink_p, 0) AS int8) AS remoteid FROM (SELECT
dblink('dbname=testmst','SELECT objectid FROM activity') AS dblink_p) t1) a1
WHERE remoteid=activity.objectid);
objectid
----------
(0 rows)

Early in 7.3 development I updated dblink in a 7.2 backward compatible way.
Unfortunately, it wasn't practical to keep it backward compatible to 7.2.x, so
the current CVS HEAD version will no longer work with 7.2.x. If you would like
to try dblink Version 0.4, let me know and I'll send you a patch.

As a side note, 7.3 (currently in beta) has much improved semantics for use of
dblink, made possible by the new table function capability. E.g.:

regression_slave=# select version();
version
----------------------------------------------------------------------------------------------------------------
PostgreSQL 7.3b1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.1
20020205 (Red Hat Linux Rawhide 3.1-0.21)
(1 row)

regression_slave=# SELECT objectid FROM activity WHERE NOT EXISTS (SELECT
remoteid FROM dblink('dbname=regression','SELECT objectid FROM activity') AS
t1(remoteid int8) WHERE remoteid=activity.objectid);
objectid
----------
5
(1 row)

Or even better performance-wise:

regression_slave=# SELECT objectid FROM activity WHERE NOT EXISTS (SELECT
remoteid FROM dblink('dbname=regression','SELECT objectid FROM activity WHERE
objectid = ' || activity.objectid) AS t1(remoteid int8));
objectid
----------
5
(1 row)

If you can, please give the beta (Postgres and dblink) a try (not for
production yet, of course, but for feedback).

Thanks,

Joe

In response to

Browse pgsql-general by date

  From Date Subject
Next Message ljb 2002-09-21 01:34:22 Re: Permissions with multiple groups...
Previous Message Oleg Lebedev 2002-09-20 22:19:55 DBLink: interesting issue