From: | armand pirvu <armand(dot)pirvu(at)gmail(dot)com> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | foreign tables query performance using postgres_fdw |
Date: | 2017-11-15 20:59:00 |
Message-ID: | 69F15382-0FDE-4424-A63A-7D42431D0795@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Hi
I apologize beore hand if it is in the wrong list and if please advise which one to post in.
I am facing fhe following issue and trying to understand what is wrong. My setup below
1 -
CREATE EXTENSION postgres_fdw;
2 -
CREATE SERVER birst_levreg
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host '172.16.26.4', dbname 'birstdb');
CREATE USER MAPPING FOR csidba
SERVER birst_levreg
OPTIONS (user 'csidba', password 'secretpwd');
CREATE FOREIGN TABLE dim_item
(
show_id character varying(100),
client_id integer,
item_id character varying(100) NOT NULL,
item_type character varying(100),
item_code character varying(100),
item_name character varying(250),
item_group character varying(100),
location_id character varying(100),
session_start_date timestamp without time zone,
session_end_date timestamp without time zone,
credit_hours numeric,
total_allotted integer,
update_datetime timestamp without time zone,
is_deleted character varying(10),
item_source character varying(1)
)
SERVER birst_levreg OPTIONS (table_name 'dim_item');
grant select,update,delete,insert on dim_item_birst to public;
analyze dim_item;
On remote server I already have
CREATE TABLE dim_item (
show_id character varying(100),
client_id integer,
item_id character varying(100) NOT NULL,
item_type character varying(100),
item_code character varying(100),
item_name character varying(250),
item_group character varying(100),
location_id character varying(100),
session_start_date timestamp without time zone,
session_end_date timestamp without time zone,
credit_hours numeric,
total_allotted integer,
update_datetime timestamp without time zone,
is_deleted character varying(10),
item_source character varying(1)
);
ALTER TABLE ONLY dim_item
ADD CONSTRAINT dim_item_pkey PRIMARY KEY (item_id);
CREATE INDEX dim_item_idx ON dim_item USING btree (client_id, update_datetime);
3 -
on remote server
explain analyze select * from dim_item where item_id='156GIEPE14CX-B';
Index Scan using dim_item_pkey on dim_item (cost=0.42..2.44 rows=1 width=157) (actual time=0.134..0.135 rows=1 loops=1)
Index Cond: ((item_id)::text = '156GIEPE14CX-B'::text)
Planning time: 1.836 ms
Execution time: 0.333 ms
on local server-foreign table
explain analyze select * from dim_item where item_id='156GIEPE14CX-B';
Foreign Scan on dim_item_birst (cost=100.00..105.53 rows=1 width=1934) (actual time=1.049..1.050 rows=1 loops=1)
Planning time: 0.226 ms
Execution time: 1.626 ms
Even if I use use_remote_estimate 'true' in the server and foreign table definition the same result
I did also
explain (analyze,verbose) select * from dim_item where item_id='156GIEPE14CX-B';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------
Foreign Scan on csischema.dim_item_birst (cost=100.00..2731.33 rows=1 width=157) (actual time=1.053..1.054 rows=1 loops=1)
Output: show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id, session_start_date, session_end_date, credit_hours, total_allotted, upda
te_datetime, is_deleted, item_source
Remote SQL: SELECT show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id, session_start_date, session_end_date, credit_hours, total_all
otted, update_datetime, is_deleted, item_source FROM csischema.dim_item WHERE ((item_id = '156GIEPE14CX-B'::text))
Planning time: 0.222 ms
Execution time: 1.842 ms
(5 rows)
And on remote server
explain analyze
SELECT show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id, session_start_date,
session_end_date, credit_hours, total_allotted, update_datetime, is_deleted, item_source FROM csischema.dim_item WHERE ((item_id = '156GIEPE14CX-B'::text))
--------------------------------------------------------------------------------------------------------------------------
Index Scan using dim_item_pkey on dim_item (cost=0.42..2.44 rows=1 width=157) (actual time=0.146..0.147 rows=1 loops=1)
Index Cond: ((item_id)::text = '156GIEPE14CX-B'::text)
Planning time: 1.949 ms
Execution time: 0.456 ms
(4 rows)
So what am I missing ? Why do I get a scan using foreign table as opposed to a pkey scan ?
Shall I understand that in fact the plan I am seing on the local server I should just ignore it, since the processing is in fact done on the remote(foreign) server and the difference between costs is nothing else but the overhead in between ?
From the manual I got this
"When use_remote_estimate is true, postgres_fdw obtains row count and cost estimates from the remote server and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. When use_remote_estimate is false, postgres_fdw performs local row count and cost estimation and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. This local estimation is unlikely to be very accurate unless local copies of the remote table's statistics are available. Running ANALYZE on the foreign table is the way to update the local statistics; this will perform a scan of the remote table and then calculate and store statistics just as though the table were local. Keeping local statistics can be a useful way to reduce per-query planning overhead for a remote table — but if the remote table is frequently updated, the local statistics will soon be obsolete.
"
Many thanks
Armand
From | Date | Subject | |
---|---|---|---|
Next Message | rammohan ganapavarapu | 2017-11-15 21:23:57 | Re: Can master and slave on different PG versions? |
Previous Message | Klaus P. Pieper | 2017-11-15 19:21:44 | Re: Performance difference between servers |