Re: 3000x Slower query when using Foreign Data Wrapper vs. local

From: Mohammad Habbab <moh(dot)habbab(at)gmail(dot)com>
To: desmodemone <desmodemone(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 3000x Slower query when using Foreign Data Wrapper vs. local
Date: 2015-10-11 10:58:16
Message-ID: CA+AzKzaKNZecrJ2iVqGFnksst0wdAQdn8kzg9=5qbGC=KwuBZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> I am not sure why, by the way I think because you could have the local
tables mixed with the foreign tables, so in that case, you have to use
> the local cost base optimizer
Oh right ! yep, that explains it. Thank you very much !

Best Regards,
Mohammad

On Sun, Oct 11, 2015 at 6:48 PM, desmodemone <desmodemone(at)gmail(dot)com> wrote:

>
>
> 2015-10-11 12:10 GMT+02:00 Mohammad Habbab <moh(dot)habbab(at)gmail(dot)com>:
>
>> Awesome ! Thank you very much, that solved it :) . But, do you have any
>> idea why this isn't enabled by default ?
>> As a first time user for FDW I would assume that usage of remote
>> estimates would be enabled by default because they would be more
>> authoritative and more representative of access patterns. Correct ?
>>
>> Best Regards,
>> Mohammad
>>
>> On Sun, Oct 11, 2015 at 5:42 PM, desmodemone <desmodemone(at)gmail(dot)com>
>> wrote:
>>
>>> Hi Mohammad,
>>> I think it's not enable
>>> "use_remote_estimate" during the creation of the foreign table
>>>
>>> http://www.postgresql.org/docs/9.4/static/postgres-fdw.html
>>>
>>> use_remote_estimate
>>>
>>> This option, which can be specified for a foreign table or a foreign
>>> server, controls whether postgres_fdw issues remote EXPLAIN commands to
>>> obtain cost estimates. A setting for a foreign table overrides any setting
>>> for its server, but only for that table. The default is false.
>>>
>>>
>>> try it
>>>
>>>
>>> Bye
>>>
>>>
>>> 2015-10-11 10:05 GMT+02:00 Mohammad Habbab <moh(dot)habbab(at)gmail(dot)com>:
>>>
>>>> Hi there,
>>>>
>>>> If it's possible, I would really appreciate any hints or help on an
>>>> issue I've been facing lately.
>>>> I'm running two instance of Postgres locally: 9.4.4 (operational db)
>>>> and 9.5beta1 (analytical db). I've already imported schema to analytical db
>>>> and while doing the following query I find very different query plans being
>>>> executed:
>>>>
>>>> Query:
>>>>
>>>> EXPLAIN ANALYZE VERBOSE SELECT
>>>> o.id AS id,
>>>> o.company_id AS company_id,
>>>> o.created_at::date AS created_at,
>>>> COALESCE(o.assignee_id, 0) AS assignee_id,
>>>> (o.tax_treatment)::text AS tax_treatment,
>>>> COALESCE(o.tax_override, 0) AS tax_override,
>>>> COALESCE(o.stock_location_id, 0) AS stock_location_id,
>>>> COALESCE(l.label, 'N/A')::text AS stock_location_name,
>>>> COALESCE(sa.country, 'N/A')::text AS shipping_address_country,
>>>> COALESCE(o.tags, ARRAY[]::text[]) AS tags
>>>> FROM orders AS o
>>>> INNER JOIN locations AS l ON l.id = o.stock_location_id
>>>> INNER JOIN addresses AS sa ON sa.id = o.shipping_address_id
>>>> WHERE o.account_id = <some_value> AND l.account_id = <another_value>
>>>> LIMIT 10;
>>>>
>>>>
>>>> Plan when I run it locally on operational db:
>>>>
>>>> Limit (cost=747.62..811.46 rows=1 width=76) (actual
>>>> time=28.208..28.397 rows=10 loops=1)
>>>> Output: o.id, o.company_id, ((o.created_at)::date),
>>>> (COALESCE(o.assignee_id, 0)), ((o.tax_treatment)::text),
>>>> (COALESCE(o.tax_override, 0::numeric)), (COALESCE(o.stock_location_id, 0)),
>>>> ((COALESCE(l.label, 'N/A'::character varying))::text),
>>>> ((COALESCE(sa.country, 'N/A'::character varying))::text), (COALESCE(o.tags,
>>>> '{}'::character varying[]))
>>>> -> Nested Loop (cost=747.62..811.46 rows=1 width=76) (actual
>>>> time=28.208..28.395 rows=10 loops=1)
>>>> Output: o.id, o.company_id, (o.created_at)::date,
>>>> COALESCE(o.assignee_id, 0), (o.tax_treatment)::text,
>>>> COALESCE(o.tax_override, 0::numeric), COALESCE(o.stock_location_id, 0),
>>>> (COALESCE(l.label, 'N/A'::character varying))::text, (COALESCE(sa.country,
>>>> 'N/A'::character varying))::text, COALESCE(o.tags, '{}'::character
>>>> varying[])
>>>> -> Nested Loop (cost=747.19..807.15 rows=1 width=73) (actual
>>>> time=28.164..28.211 rows=10 loops=1)
>>>> Output: o.id, o.company_id, o.created_at, o.assignee_id,
>>>> o.tax_treatment, o.tax_override, o.stock_location_id, o.tags,
>>>> o.shipping_address_id, l.label
>>>> -> Index Scan using index_locations_on_account_id on
>>>> public.locations l (cost=0.29..8.31 rows=1 width=20) (actual
>>>> time=0.025..0.025 rows=1 loops=1)
>>>> Output: l.id, l.address1, l.address2, l.city,
>>>> l.country, l.zip_code, l.suburb, l.state, l.label, l.status, l.latitude,
>>>> l.longitude, l.created_at, l.updated_at, l.account_id, l.holds_stock
>>>> Index Cond: (l.account_id = 18799)
>>>> -> Bitmap Heap Scan on public.orders o
>>>> (cost=746.90..798.71 rows=13 width=57) (actual time=28.133..28.176 rows=10
>>>> loops=1)
>>>> Output: o.id, o.account_id, o.company_id, o.status,
>>>> o.invoice_number, o.reference_number, o.due_at, o.issued_at, o.user_id,
>>>> o.notes, o.created_at, o.updated_at, o.order_number, o.billing_address_id,
>>>> o.shipping_address_id, o.payment_status, o.email, o.fulfillment_status,
>>>> o.phone_number, o.assignee_id, o.tax_treatment, o.tax_override,
>>>> o.tax_label_override, o.stock_location_id, o.currency_id, o.source,
>>>> o.source_url, o.demo, o.invoice_status, o.ship_at, o.source_id, o.search,
>>>> o.default_price_list_id, o.contact_id, o.return_status, o.tags,
>>>> o.packed_status, o.returning_status, o.shippability_status,
>>>> o.backordering_status
>>>> Recheck Cond: ((o.stock_location_id = l.id) AND
>>>> (o.account_id = 18799))
>>>> Heap Blocks: exact=7
>>>> -> BitmapAnd (cost=746.90..746.90 rows=13 width=0)
>>>> (actual time=23.134..23.134 rows=0 loops=1)
>>>> -> Bitmap Index Scan on
>>>> index_orders_on_stock_location_id_manual (cost=0.00..18.02 rows=745
>>>> width=0) (actual time=9.282..9.282 rows=40317 loops=1)
>>>> Index Cond: (o.stock_location_id = l.id)
>>>> -> Bitmap Index Scan on
>>>> index_orders_on_account_id (cost=0.00..718.94 rows=38735 width=0) (actual
>>>> time=9.856..9.856 rows=40317 loops=1)
>>>> Index Cond: (o.account_id = 18799)
>>>> -> Index Scan using addresses_pkey on public.addresses sa
>>>> (cost=0.43..4.30 rows=1 width=11) (actual time=0.015..0.016 rows=1
>>>> loops=10)
>>>> Output: sa.id, sa.company_id, sa.address1, sa.city,
>>>> sa.country, sa.zip_code, sa.created_at, sa.updated_at, sa.suburb, sa.state,
>>>> sa.label, sa.status, sa.address2, sa.phone_number, sa.email,
>>>> sa.company_name, sa.latitude, sa.longitude, sa.first_name, sa.last_name
>>>> Index Cond: (sa.id = o.shipping_address_id)
>>>> Planning time: 1.136 ms
>>>> Execution time: 28.621 ms
>>>> (23 rows)
>>>>
>>>> Plan when I run it from analytical db via FDW:
>>>>
>>>> Limit (cost=300.00..339.95 rows=1 width=1620) (actual
>>>> time=7630.240..82368.326 rows=10 loops=1)
>>>> Output: o.id, o.company_id, ((o.created_at)::date),
>>>> (COALESCE(o.assignee_id, 0)), ((o.tax_treatment)::text),
>>>> (COALESCE(o.tax_override, '0'::numeric)), (COALESCE(o.stock_location_id,
>>>> 0)), ((COALESCE(l.label, 'N/A'::character varying))::
>>>> text), ((COALESCE(sa.country, 'N/A'::character varying))::text),
>>>> (COALESCE(o.tags, '{}'::character varying[]))
>>>> -> Nested Loop (cost=300.00..339.95 rows=1 width=1620) (actual
>>>> time=7630.238..82368.314 rows=10 loops=1)
>>>> Output: o.id, o.company_id, (o.created_at)::date,
>>>> COALESCE(o.assignee_id, 0), (o.tax_treatment)::text,
>>>> COALESCE(o.tax_override, '0'::numeric), COALESCE(o.stock_location_id, 0),
>>>> (COALESCE(l.label, 'N/A'::character varying))::text,
>>>> (COALESCE(sa.country, 'N/A'::character varying))::text,
>>>> COALESCE(o.tags, '{}'::character varying[])
>>>> Join Filter: (o.shipping_address_id = sa.id)
>>>> Rows Removed by Join Filter: 19227526
>>>> -> Nested Loop (cost=200.00..223.58 rows=1 width=1108)
>>>> (actual time=69.758..69.812 rows=10 loops=1)
>>>> Output: o.id, o.company_id, o.created_at,
>>>> o.assignee_id, o.tax_treatment, o.tax_override, o.stock_location_id,
>>>> o.tags, o.shipping_address_id, l.label
>>>> Join Filter: (o.stock_location_id = l.id)
>>>> Rows Removed by Join Filter: 18
>>>> -> Foreign Scan on remote.orders o
>>>> (cost=100.00..111.67 rows=1 width=592) (actual time=68.009..68.014 rows=10
>>>> loops=1)
>>>> Output: o.id, o.account_id, o.company_id,
>>>> o.status, o.invoice_number, o.reference_number, o.due_at, o.issued_at,
>>>> o.user_id, o.notes, o.created_at, o.updated_at, o.order_number,
>>>> o.billing_address_id, o.shipping_address
>>>> _id, o.payment_status, o.email, o.fulfillment_status, o.phone_number,
>>>> o.assignee_id, o.tax_treatment, o.tax_override, o.tax_label_override,
>>>> o.stock_location_id, o.currency_id, o.source, o.source_url, o.demo,
>>>> o.invoice_status, o.ship_at, o
>>>> .source_id, o.search, o.default_price_list_id, o.contact_id,
>>>> o.return_status, o.tags, o.packed_status, o.returning_status,
>>>> o.shippability_status, o.backordering_status
>>>> Remote SQL: SELECT id, company_id, created_at,
>>>> shipping_address_id, assignee_id, tax_treatment, tax_override,
>>>> stock_location_id, tags FROM public.orders WHERE ((account_id = 18799))
>>>> -> Foreign Scan on remote.locations l
>>>> (cost=100.00..111.90 rows=1 width=520) (actual time=0.174..0.174 rows=3
>>>> loops=10)
>>>> Output: l.id, l.address1, l.address2, l.city,
>>>> l.country, l.zip_code, l.suburb, l.state, l.label, l.status, l.latitude,
>>>> l.longitude, l.created_at, l.updated_at, l.account_id, l.holds_stock
>>>> Remote SQL: SELECT id, label FROM public.locations
>>>> WHERE ((account_id = 18799))
>>>> -> Foreign Scan on remote.addresses sa (cost=100.00..114.50
>>>> rows=150 width=520) (actual time=0.634..8029.415 rows=1922754 loops=10)
>>>> Output: sa.id, sa.company_id, sa.address1, sa.city,
>>>> sa.country, sa.zip_code, sa.created_at, sa.updated_at, sa.suburb, sa.state,
>>>> sa.label, sa.status, sa.address2, sa.phone_number, sa.email,
>>>> sa.company_name, sa.latitude, sa.l
>>>> ongitude, sa.first_name, sa.last_name
>>>> Remote SQL: SELECT id, country FROM public.addresses
>>>> Planning time: 0.209 ms
>>>> Execution time: 82391.610 ms
>>>> (21 rows)
>>>>
>>>> Time: 82393.211 ms
>>>>
>>>> What am I doing wrong ? really appreciate any guidance possible. Thank
>>>> you very much for taking the time to helping me with this.
>>>>
>>>> Best Regards,
>>>> Mohammad
>>>>
>>>
>>>
>>>
>>> --
>>> Matteo Durighetto
>>>
>>> - - - - - - - - - - - - - - - - - - - - - - -
>>>
>>> Italian PostgreSQL User Group <http://www.itpug.org/index.it.html>
>>> Italian Community for Geographic Free/Open-Source Software
>>> <http://www.gfoss.it>
>>>
>>
>>
>>
>> --
>> Mohammad Habbab
>> Bangsar, KL, Malaysia
>> Mobile No. +601111582144
>> Email: moh(dot)habbab(at)gmail(dot)com
>> LinkedIn: https://www.linkedin.com/in/mohammadhabbab
>>
>
>
> Hi,
> I am not sure why, by the way I think because you could have the
> local tables mixed with the foreign tables, so in that case, you have to use
> the local cost base optimizer [if you not rewrite query with CTE with only
> the fdw tables and use so the use_remote_estimate], and so you need local
> statistics of local and remote table [ infact you could also analyze fdw
> table and store the statistics in local dictionary ].
>
> In your case I see you have all fdw tables, so it makes more sense to use
> remote cost base optmizer.
>
>
> Have a nice day
> --
> Matteo Durighetto
>
> - - - - - - - - - - - - - - - - - - - - - - -
>
> Italian PostgreSQL User Group <http://www.itpug.org/index.it.html>
> Italian Community for Geographic Free/Open-Source Software
> <http://www.gfoss.it>
>

--
Mohammad Habbab
Bangsar, KL, Malaysia
Mobile No. +601111582144
Email: moh(dot)habbab(at)gmail(dot)com
LinkedIn: https://www.linkedin.com/in/mohammadhabbab

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Shaun Thomas 2015-10-12 18:17:53 Having some problems with concurrent COPY commands
Previous Message desmodemone 2015-10-11 10:48:58 Re: 3000x Slower query when using Foreign Data Wrapper vs. local