From: Torsten Förtsch <torsten(dot)foertsch(at)gmx(dot)net>
To: "Kohler Manuel (ID SIS)" <manuel(dot)kohler(at)id(dot)ethz(dot)ch>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Indices and Foreign Tables
Date: 2014-04-04 19:54:51
Message-ID: 533F0E0B.2090105@gmx.net
Lists: pgsql-general

On 04/04/14 17:14, Kohler Manuel (ID SIS) wrote:
> Here is the same query with fdw:
>
> db=# EXPLAIN VERBOSE select ds.code, count(*), sum(dsf.size_in_bytes) as "raw_size",pg_size_pretty(sum(dsf.size_in_bytes)) as "size" from data_set_files_fdw dsf, data_sets_fdw ds where dsf.parent_id is null and dsf.dase_id=ds.id group by ds.code order by raw_size desc;
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------
> Sort  (cost=327.81..327.97 rows=64 width=40)
>   Output: ds.code, (count(*)), (sum(dsf.size_in_bytes)), (pg_size_pretty(sum(dsf.size_in_bytes)))
>   Sort Key: (sum(dsf.size_in_bytes))
>   ->  HashAggregate  (cost=325.09..325.89 rows=64 width=40)
>         Output: ds.code, count(*), sum(dsf.size_in_bytes), pg_size_pretty(sum(dsf.size_in_bytes))
>         ->  Hash Join  (cost=270.61..324.45 rows=64 width=40)
>               Output: dsf.size_in_bytes, ds.code
>               Hash Cond: (ds.id = (dsf.dase_id)::bigint)
>               ->  Foreign Scan on public.data_sets_fdw ds  (cost=100.00..148.40 rows=1280 width=40)
>                     Output: ds.id, ds.code, ds.location
>                     Remote SQL: SELECT id, code FROM public.data_sets
>               ->  Hash  (cost=170.48..170.48 rows=10 width=16)
>                     Output: dsf.size_in_bytes, dsf.dase_id
>                     ->  Foreign Scan on public.data_set_files_fdw dsf  (cost=100.00..170.48 rows=10 width=16)
>                           Output: dsf.size_in_bytes, dsf.dase_id
>                           Filter: (dsf.parent_id IS NULL)
>                           Remote SQL: SELECT dase_id, parent_id, size_in_bytes FROM public.data_set_files
> (17 rows)
>
> And also with ANALYZE:
>
> db=# EXPLAIN ANALYZE select ds.code, count(*), sum(dsf.size_in_bytes) as "raw_size",pg_size_pretty(sum(dsf.size_in_bytes)) as "size" from data_set_files_fdw dsf, data_sets_fdw ds where dsf.parent_id is null and dsf.dase_id=ds.id group by ds.code order by raw_size desc;
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
> Sort  (cost=327.81..327.97 rows=64 width=40) (actual time=219401.864..219414.641 rows=13839 loops=1)
>   Sort Key: (sum(dsf.size_in_bytes))
>   Sort Method: quicksort  Memory: 2283kB
>   ->  HashAggregate  (cost=325.09..325.89 rows=64 width=40) (actual time=219327.664..219363.709 rows=13839 loops=1)
>         ->  Hash Join  (cost=270.61..324.45 rows=64 width=40) (actual time=219127.848..219277.308 rows=13839 loops=1)
>               Hash Cond: (ds.id = (dsf.dase_id)::bigint)
>               ->  Foreign Scan on data_sets_fdw ds  (cost=100.00..148.40 rows=1280 width=40) (actual time=1.057..77.415 rows=13839 loops=1)
>               ->  Hash  (cost=170.48..170.48 rows=10 width=16) (actual time=219126.713..219126.713 rows=13839 loops=1)
>                     Buckets: 1024  Batches: 1  Memory Usage: 649kB
>                     ->  Foreign Scan on data_set_files_fdw dsf  (cost=100.00..170.48 rows=10 width=16) (actual time=1082.614..219083.326 rows=13839 loops=1)
>                           Filter: (parent_id IS NULL)
>                           Rows Removed by Filter: 35726596
> Total runtime: 219438.925 ms
> (1

Did you know that you can combine VERBOSE and ANALYZE in one EXPLAIN?

EXPLAIN (ANALYZE, VERBOSE) SELECT ...

The problem is that your qualifier is not pushed down to the backend
database. So you pull 35726596+13839 rows (the complete table) from
data_set_files in the remote database, filter them locally, and throw
35726596 of them away. Then you copy the remote data_sets table
(13839 rows). And only then do you join them.
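You can see this in your EXPLAIN VERBOSE output: the Remote SQL line for
data_set_files contains no WHERE clause. Had the qual been pushed down, that
line would look roughly like this (a sketch, not output from your system):

```sql
-- Remote SQL as postgres_fdw would emit it with the filter pushed down:
-- Remote SQL: SELECT dase_id, size_in_bytes FROM public.data_set_files
--             WHERE (parent_id IS NULL)
SELECT dase_id, size_in_bytes
FROM public.data_set_files
WHERE parent_id IS NULL;
```

With that, only the 13839 matching rows would cross the network instead of
all 35 million.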

I don't think use_remote_estimate can help a lot in this situation.
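If you want to try it anyway, use_remote_estimate can be set on the server or
on individual foreign tables; the object names below are placeholders for
whatever you called yours:

```sql
-- Ask the remote server for row estimates instead of guessing locally
-- (the rows=10 estimate in your plan is such a local guess).
ALTER SERVER remote_server
    OPTIONS (ADD use_remote_estimate 'true');

-- Or only for one foreign table:
ALTER FOREIGN TABLE data_set_files_fdw
    OPTIONS (ADD use_remote_estimate 'true');
```

That improves the estimates, but it does not make the planner push a qual it
has decided not to push.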

We tried to use postgres_fdw in a similar way last year, then switched
to dblink because, although it is possible for the FDW to push
qualifiers down to the backend, it too often does not do so. It then
copies large tables over the network and scans them sequentially on the
local machine. Also, LIMIT, for instance, is never pushed to the backend.

Now, we create functions like this:

CREATE OR REPLACE FUNCTION __get_client_metrics_on(srv TEXT)
RETURNS TABLE(loginid TEXT,
...)
AS $def$

SELECT tb.*
FROM dblink($1, $$

-- remote query starts here

SELECT c.loginid, ...
FROM clients c
JOIN ...

-- remote query ends here

$$) AS tb(loginid TEXT,
...)

$def$ LANGUAGE sql VOLATILE SECURITY definer ROWS ... COST ...;

CREATE OR REPLACE FUNCTION get_client_metrics()
RETURNS TABLE(srv TEXT,
loginid TEXT,
...)
AS $def$

SELECT s.srvname, rem.*
FROM production_servers() s
CROSS JOIN __get_client_metrics_on(s.srvname) rem

$def$ LANGUAGE sql STABLE SECURITY invoker;

The production_servers function is defined more or less like this:

CREATE OR REPLACE FUNCTION production_servers()
RETURNS TABLE(srvname TEXT) AS $def$

SELECT s.srvname::TEXT
FROM pg_catalog.pg_foreign_data_wrapper w
JOIN pg_catalog.pg_foreign_server s ON (s.srvfdw=w.oid)
WHERE w.fdwname IN ('postgres_fdw', 'dblink_fdw')

$def$ LANGUAGE sql STABLE;

Then you can

SELECT ... FROM get_client_metrics() JOIN ... WHERE ...

It is much more work than simply importing tables from the backend as
foreign tables. But for us it worked very well, and in many cases we
achieved much better performance than before. This approach replaced a
system where all the production databases poured their data into one
large database via Bucardo, only so we could run analyses on the
complete data set.

And there is more you can achieve with dblink. For instance, you can
execute the remote queries concurrently (you don't even need PL/pgSQL
for that; pure SQL is sufficient).
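The concurrent variant looks roughly like this; connection names, server
names, and the query are illustrative:

```sql
-- Open one named connection per remote server.
SELECT dblink_connect('c1', 'srv1');
SELECT dblink_connect('c2', 'srv2');

-- dblink_send_query() returns immediately, so both
-- servers now execute their queries in parallel.
SELECT dblink_send_query('c1', 'SELECT count(*) FROM big_table');
SELECT dblink_send_query('c2', 'SELECT count(*) FROM big_table');

-- dblink_get_result() blocks until the respective query has finished.
SELECT * FROM dblink_get_result('c1') AS t(cnt BIGINT);
SELECT * FROM dblink_get_result('c2') AS t(cnt BIGINT);

SELECT dblink_disconnect('c1');
SELECT dblink_disconnect('c2');
```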

Torsten
