Re: FDW, too long to run explain

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Vijaykumar Jain <vjain(at)opentable(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: FDW, too long to run explain
Date: 2019-02-17 18:58:47
Message-ID: CAMkU=1zokhfktj6KMtyHirjjD15VCab5=e=3uQYkHJuTp15K_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, Feb 17, 2019 at 12:41 PM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Mon, Feb 4, 2019 at 2:15 AM Vijaykumar Jain <vjain(at)opentable(dot)com>
> wrote:
>
>>
>> now we have some long running queries via FDW that take minutes and get
>> killed explain runs as idle in transaction on remote servers.
>>
>
> Are you saying the EXPLAIN itself gets killed, or execution of the plan
> generated based on the EXPLAIN (issued under use_remote_estimate = true)
> gets killed? Who is doing the killing, the local side or the foreign
> side? Can you include verbatim log entries for this?
>

After thinking about it a bit more, I think I see the issue here. The
EXPLAIN pursuant to use_remote_estimate is issued in the same remote
transaction as the following DECLARE and FETCH's are. But after the
EXPLAIN is issued, the local server executes the query for a different FDW
to satisfy some other branch of the UNION ALL, giving the first FDW
connection time to do an idle-in-transaction timeout. This happens even if
no rows need to fetched from that FDW, because another branch of the UNION
ALL satisfied the LIMIT.

A question for the PostgreSQL hackers would be, Is it necessary and
desirable that the EXPLAIN be issued in the same transaction as the
eventual DECLARE and FETCHes? I don't think it is. I guess if the foreign
side table definition got changed between EXPLAIN and DECLARE it would
cause problems, but changing the foreign side definition out of sync with
the local side can cause problems anyway, so is that important to preserve?

Changing that might narrow but not completely fix the problem, as there
might still be delays between the DECLARE and the FETCH or between
successive FETCHes.

So a question for you would be, why do have such an aggressive setting
for idle_in_transaction_session_timeout that it causes this to happen?
Couldn't you relax it, perhaps just for the role used for the FDW
connections?

Cheers,

Jeff

>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2019-02-17 19:09:16 Re: FDW, too long to run explain
Previous Message Andres Freund 2019-02-17 18:54:54 Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2