Re: Intermittent errors when fetching cursor rows on PostgreSQL 16

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Enrico Schenone <eschenone(at)cleistech(dot)it>, pgsql-general(at)lists(dot)postgresql(dot)org
Cc: Massimo Catti <mcatti(at)cleistech(dot)it>, Livio Pizzolo <lpizzolo(at)cleistech(dot)it>
Subject: Re: Intermittent errors when fetching cursor rows on PostgreSQL 16
Date: 2025-01-13 17:26:09
Message-ID: 8d65f84d-ddb3-4d0f-be05-44f443500e41@aklaver.com
Lists: pgsql-general

On 1/13/25 08:59, Enrico Schenone wrote:
>
> On 13/01/25 17:19, Adrian Klaver wrote:
>> On 1/13/25 00:45, Enrico Schenone wrote:
>>> Hello, Adrian.
>>> As I said days ago, I have arranged a kind of stress test in the
>>> production environment.
>>> I wrote a program that loads a temporary table with 2049 rows from
>>> a baseline_table and finally declares two nested cursors.
>>> The first cursor is on the temp table as parent, while the second is
>>> on a lookup table as child.
>>>
>>> The program logic is a transposition of a fragment from several
>>> production programs that were failing on cursors, and is intended
>>> as a POC only.
>>>
>>
>>
>>> And, well, I'm quite confused: no error at all has been detected, not
>>> only in the test programs but in the whole production system. The
>>> error has completely disappeared.
>>>
>>> Then I have stopped the four tasks of the stress test leaving all
>>> other services running for a week, and again no error at all.
>>>
>>> No setup was changed, no servers were rebooted, and no infrastructure
>>> was upgraded during the test period.
>>
>> You are absolutely sure about the above?
> I can say yes. All test operations have been logged and verified against
> the PostgreSQL log.
> The only component not under my control is the provider's
> infrastructure, but the infrastructure admin assured me that no
> operation at all has been performed. I believe him because he is a
> reliable technician and a well-known person.

In your OP you stated:

"Production environments can be:

* Distinct application server and DB server on distinct subnets (no
dropped packet detected on firewall, no memory/disk/network failure
detected by "nmon" tool)
* Distinct application server and DB server on same subnet (no firewall)
* Same server for PostgreSQL and applications
"

In all those cases, are the various servers all running completely within
the provider's infrastructure?

>> Errors that 'fix' themselves are the most frustrating kind, as you
>> know in the back of your mind they will likely pop up again.
> True, knocking at my door again ... I still can't believe it.

Going forward, one of three things is likely to happen:

1) The error never shows again.

2) It does show up again but in a manner that allows it to be traced.

3) The worst case, it plays hide and seek as previously.

> Thanks a lot for your interest in sharing my strange experience.
> Best regards.
> Enrico
>
> *Enrico Schenone*
> Software Architect

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
