From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: BackgroundPsql swallowing errors on windows |
Date: | 2025-02-16 22:39:51 |
Message-ID: | 20250216223951.a3.nmisch@google.com |
Lists: | pgsql-hackers |
On Sun, Feb 16, 2025 at 03:55:10PM -0500, Andres Freund wrote:
> On 2025-02-16 10:47:40 -0800, Noah Misch wrote:
> > On Sun, Feb 16, 2025 at 01:02:01PM -0500, Andres Freund wrote:
> > > On 2025-02-16 09:39:43 -0800, Noah Misch wrote:
> > > > On Thu, Feb 13, 2025 at 12:39:04PM -0500, Andres Freund wrote:
> > > > > I suspect what's happening is that the communication with the
> > > > > external process allows for reordering between stdout/stderr.
> > > > >
> > > > > And indeed, changing BackgroundPsql::query() to emit the banner on both stdout
> > > > > and stderr and waiting on both seems to fix the issue.
> > > >
> > > > That makes sense. I wondered how one might fix IPC::Run to preserve the
> > > > relative timing of stdout and stderr, not perturbing the timing the way that
> > > > disrupted your test run. I can think of two strategies:
> > > >
> > > > - Remove the proxy.
> > > >
> > > > - Make pipe data visible to Perl variables only when at least one of the
> > > > proxy<-program_under_test pipes had no data ready to read. In other words,
> > > > if both pipes have data ready, make all that data visible to Perl code
> > > > simultaneously. (When both the stdout pipe and the stderr pipe have data
> > > > ready, one can't determine data arrival order.)
> > > >
> > > > Is there a possibly-less-invasive change that might work?
> > > seems proxying all the pipes through one
> > > connection might be an option.
> >
> > It would. Thanks. However, I think that would entail modifying the program
> > under test to cooperate with the arrangement. When running an ordinary
> > program that does write(1, ...) and write(2, ...), the read end needs some way
> > to deal with the uncertainty about which write happened first. dup2(1, 2)
> > solves the order ambiguity, but it loses other signal.
>
> I think what's happening in this case must go beyond just that. Afaict just
> doing ->pump_nb() would otherwise solve it. My uninformed theory is that two
> tcp connections are used. With two pipes
>
> P1: write(1)
> P1: write(2)
> P2: read(1)
> P2: read(2)
>
> wouldn't ever result in P2 not seeing data on either of reads.

True.

> But with two
> TCP sockets there can be time between the send() completing and recv() on the
> other side reading the data, even on a local system (e.g. due to the tcp stack
> waiting a while for more data before sending data).
>
> To avoid that the proxy program could read from N pipes and then proxy them
> through *one* socket by prefixing the data with information about which pipe
> the data is from. Then IPC::Run could split the data again, using the added
> prefix.
>
> I don't think that would require modifying the program under test?

I think that gets an order anomaly when the proxy is slow and the program
under test does this:

  write(2, "ERROR: 1")
  write(1, "BANNER 1")
  write(2, "ERROR: 2")
  write(1, "BANNER 2")

If the proxy is sufficiently fast, it will wake up four times and perform two
read() calls on each of the two pipes. On the flip side, if the proxy is
sufficiently slow, it will wake up once and perform one read() on each of the
two pipes. In the slow case, the reads get "ERROR: 1ERROR: 2" and
"BANNER 1BANNER 2". The proxy sends the data onward as though the program
under test had done:

  write(1, "BANNER 1BANNER 2")
  write(2, "ERROR: 1ERROR: 2")

From the slow proxy's perspective, it can't rule out the program under test
having done those two write() calls. The proxy doesn't have enough
information to reconstruct the original four write() calls. What prevents
that anomaly?
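For concreteness, here is a minimal Python sketch of the slow case. It is not
IPC::Run code: two ordinary os.pipe() pairs stand in for the
proxy<-program_under_test pipes, and the function name is made up for
illustration. All four writes land before the "proxy" does its single read()
on each pipe, so the interleaving is gone by the time anything is read.

```python
import os


def slow_proxy_reads():
    """Hypothetical demo: four interleaved writes, then one read() per pipe."""
    # Two pipes stand in for the program under test's stdout and stderr.
    out_r, out_w = os.pipe()
    err_r, err_w = os.pipe()

    # The program under test alternates between stderr and stdout.
    os.write(err_w, b"ERROR: 1")
    os.write(out_w, b"BANNER 1")
    os.write(err_w, b"ERROR: 2")
    os.write(out_w, b"BANNER 2")

    # A slow proxy wakes up only now and performs one read() on each pipe.
    out_data = os.read(out_r, 4096)
    err_data = os.read(err_r, 4096)

    for fd in (out_r, out_w, err_r, err_w):
        os.close(fd)
    return out_data, err_data


out_data, err_data = slow_proxy_reads()
print(out_data)
print(err_data)
```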

If the proxy were to convey this uncertainty so the consumer side knew to
handle these writes as an atomic unit, I think that would work:

  write(n, "proxy-begin")
  write(n, "fd-1[BANNER 1BANNER 2]")
  write(n, "fd-2[ERROR: 1ERROR: 2]")
  write(n, "proxy-commit")
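
A rough Python sketch of how the two ends might handle such a unit follows.
Everything in it is an assumption for illustration, not an existing IPC::Run
interface: the function names are invented, and the wire format swaps the
bracket notation above for a length-prefixed "fd-N LEN" header so that payloads
containing "]" or newlines stay unambiguous. The consumer exposes a batch only
once it has seen proxy-commit, so data from both pipes becomes visible
simultaneously.

```python
def encode_batch(chunks):
    """Frame one proxy wakeup: each pipe's pending bytes, sent as one unit."""
    parts = [b"proxy-begin\n"]
    for fd in sorted(chunks):
        data = chunks[fd]
        # Length prefix (an assumption of this sketch) keeps payloads binary-safe.
        parts.append(b"fd-%d %d\n" % (fd, len(data)))
        parts.append(data)
    parts.append(b"proxy-commit\n")
    return b"".join(parts)


def _decode_one(buf, pos):
    """Decode one batch starting at pos; return (None, pos) if it is incomplete."""
    nl = buf.find(b"\n", pos)
    if nl < 0:
        return None, pos
    if buf[pos:nl] != b"proxy-begin":
        raise ValueError("expected proxy-begin")
    pos = nl + 1
    batch = {}
    while True:
        nl = buf.find(b"\n", pos)
        if nl < 0:
            return None, pos
        header = buf[pos:nl]
        pos = nl + 1
        if header == b"proxy-commit":
            return batch, pos
        name, length = header.split()
        fd = int(name[len(b"fd-"):])
        n = int(length)
        if len(buf) - pos < n:
            return None, pos
        batch[fd] = buf[pos:pos + n]
        pos += n


def decode_batches(buf):
    """Return (committed batches, unconsumed tail); partial batches stay in the tail."""
    batches = []
    pos = 0
    while True:
        batch, end = _decode_one(buf, pos)
        if batch is None:
            return batches, buf[pos:]
        batches.append(batch)
        pos = end


wire = encode_batch({1: b"BANNER 1BANNER 2", 2: b"ERROR: 1ERROR: 2"})
batches, tail = decode_batches(wire)
print(batches, tail)
```

With this shape, a consumer that receives only part of a batch gets no
batches back and keeps the bytes in the tail until the commit marker arrives.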