Filter JSONB on Foreign Table in postgres_fdw

From: Luan Huynh <nnhluan(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org, pgsql-sql(at)postgresql(dot)org
Subject: Filter JSONB on Foreign Table in postgres_fdw
Date: 2017-07-18 03:06:56
Message-ID: CAAP3KKHgabCiSGF7+_2OpW-+BqU_3c7Q139WjesQ1UGmxJ3_VQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi guys,
I am using PostgreSQL v.9.6
When using postgres_fdw, I got an issue with "ORDER BY" (here's my question
on stackexchange
<https://dba.stackexchange.com/questions/179744/order-by-too-slow-in-foreign-table-using-postgres-fdw>
).
Please provide me some info to fix that issue ?
Thank you

-------------------------------------------------------------------------------------------------------------------------------------

*Foreign table*

CREATE FOREIGN TABLE user_info (
id bigint ,
info jsonb )
SERVER server_test_fdw OPTIONS(SCHEMA_NAME 'public', TABLE_NAME
'user_info_raw' );-- user_info_raw is a large table (100 million
records, 200 GB)

*Sample data of info column*

{"key1": 1, "key2": 0.678}{"key1": 1, "key2": 1.0}{"key1": 1, "key2":
0.986} {"key1": 2, "key2": 0.75}{"key1": 2, "key2": 0.639}

*Query on foreign table* (*updated*)

SELECT id, info FROM user_infoWHERE info ->> 'key1'= '1' -- OR using
jsonb_extract_path_text(info, 'key1') = '1'ORDER BY id
LIMIT 10;

Limit (cost=10750829.63..10750829.65 rows=10 width=40) (actual
time=550059.320..550059.326 rows=10 loops=1)
-> Sort (cost=10750829.63..10751772.77 rows=377257 width=40)
(actual time=550059.318..550059.321 rows=10 loops=1)
Sort Key: id
Sort Method: top-N heapsort Memory: 26kB
-> Foreign Scan on user_info (cost=100.00..10742677.24
rows=377257 width=40) (actual time=1.413..536718.366 rows=68281020
loops=1)
Filter: ((info ->> 'key1'::text) = '1'::text)
Rows Removed by Filter: 7170443
Planning time: 4.097 ms
Execution time: 550059.597 ms

*Query on user_info_raw* (*updated*)

EXPLAIN ANALYSESELECT id, info FROM user_info_rawWHERE info ->>
'key1'= '1'ORDER BY id
LIMIT 10;

Limit (cost=0.57..1296.95 rows=10 width=59) (actual
time=0.043..0.073 rows=10 loops=1)
-> Index Scan using idx_user_info_raw_info on user_info_raw
(cost=0.57..68882850.88 rows=531346 width=59) (actual
time=0.042..0.070 rows=10 loops=1)
Filter: ((info ->> 'key1'::text) = '1'::text)
Planning time: 0.192 ms
Execution time: 0.102 ms
select pg_size_pretty(pg_table_size('user_info_raw'));
pg_size_pretty ----------------
223 GB

It takes *10 milliseconds* to execute the query on the *user_info_raw* (*remote
server*).

But, It takes *a lot of time* when using theuser_infoforeign table. When I
remove ORDER BY id, the query executes *very fast*.

I think that my query on *foreign table* *should send to the remote server* for
executing, but it's not, I don't know why, may be due to of this excerpt
from postgres_fdw document
<https://www.postgresql.org/docs/9.6/static/postgres-fdw.html>

By default, only WHERE clauses using *built-in operators and functions* will
be considered for execution on the remote server. Clauses involving
*non-built-in
functions* are checked locally after rows are fetched. If such functions
are available on the remote server and can be relied on to produce the same
results as they do locally, performance can be improved by sending such
WHERE clauses for remote execution

Browse pgsql-sql by date

  From Date Subject
Next Message Prashanth Reddy 2017-07-18 14:58:12 Re: Steps to place standby database in read write
Previous Message Prashanth Reddy 2017-07-18 01:27:22 Steps to place standby database in read write