From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Subject: | Re: Simplify backend terminate and wait logic in postgres_fdw test |
Date: | 2021-05-04 07:13:53 |
Message-ID: | CALj2ACWqh2nHzPyzP-bAY+CaAAbtQRO55AQ_4ppGiU_w8iOvTg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, May 4, 2021 at 4:12 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The buildfarm is showing that one of these test queries is not stable
> under CLOBBER_CACHE_ALWAYS:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-05-01%2007%3A44%3A47
>
> of which the relevant part is:
>
> diff -U3 /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/expected/postgres_fdw.out /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/results/postgres_fdw.out
> --- /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/expected/postgres_fdw.out 2021-05-01 03:44:45.022300613 -0400
> +++ /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/results/postgres_fdw.out 2021-05-03 09:11:24.051379288 -0400
> @@ -9215,8 +9215,7 @@
> WHERE application_name = 'fdw_retry_check';
> pg_terminate_backend
> ----------------------
> - t
> -(1 row)
> +(0 rows)
>
> -- This query should detect the broken connection when starting new remote
> -- transaction, reestablish new connection, and then succeed.
Thanks for the report.
> I can reproduce that locally by setting
>
> alter system set debug_invalidate_system_caches_always = 1;
>
> and running "make installcheck" in contrib/postgres_fdw.
> (It takes a good long time to run the whole test script
> though, so you might want to see if running just these few
> queries is enough.)
I can reproduce the issue with the failing case. Issue is that the
backend pid will be null in the pg_stat_activity because of the cache
invalidation that happens at the beginning of the query and hence
pg_terminate_backend returns null on null input.
> There's no evidence of distress in the postmaster log,
> so I suspect this might just be a timing instability,
> e.g. remote process already gone before local process
> looks. If so, it's probably hopeless to make this
> test stable as-is. Perhaps we should just take it out.
Actually, that test case covers retry code, so removing it worries me.
Instead, I can do as attached i.e. ignore the pg_terminate_backend
output using PERFORM, as the function signals the backend if the given
pid is a valid backend pid and returns on success. If at all, the
function is to return false, it emits a warning, so it will be caught
in the tests.
And having a retry test case with clobber cache enabled doesn't make
sense because all the cache entries are removed/invalidated for each
query, but the test case covers the code on non-clobber cache
platforms, so I would like to keep it.
Please see the attached, it passes with "alter system set
debug_invalidate_system_caches_always = 1;" on my system.
Thoughts?
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Stabilize-a-test-case-in-postgres_fdw.patch | application/octet-stream | 5.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Bharath Rupireddy | 2021-05-04 07:19:23 | Re: Enhanced error message to include hint messages for redundant options error |
Previous Message | Andrey Borodin | 2021-05-04 06:58:36 | Re: Incorrect snapshots while promoting hot standby node when 2PC is used |