From: | Heath Lord <heath(dot)lord(at)crunchydata(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: REL_13_STABLE Windows 10 Regression Failures |
Date: | 2020-10-30 19:47:23 |
Message-ID: | CA+BEBhsbe37XMAF2ChA9aoOE6RwSR15mkYFUy9_bPXoZM3bPkw@mail.gmail.com |
Lists: | pgsql-bugs |
Tom,
We are working to set up our environment so that we can get a stack
trace, as we do not have any of the Visual Studio tools installed
right now. However, I thought I would send you a little more
information while we try to get that working.
Going through the stats_ext.sql file line by line against a freshly
built REL_13_STABLE database, we have determined that running any two
of the following commands back to back, in any combination, will cause
the database to crash:
CREATE STATISTICS tst ON relnatts + relpages FROM pg_class;
CREATE STATISTICS tst ON (relpages, reltuples) FROM pg_class;
If you run another command in between them, such as:
SELECT version();
then it will not crash when you run either of those commands again.
However, if you run any combination of those two commands back to back,
it will crash the database. The output from the psql session after
stepping through the stats_ext.sql file is in the attached
stats_ext_psql_output.txt file.
The corresponding output from the postgres logfile is in the attached
pg_logfile_output.txt file.
Hopefully, this will at least give you some information while we
are working on getting the backtrace. Thanks.
-Heath
On Fri, Oct 30, 2020 at 1:25 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Heath Lord <heath(dot)lord(at)crunchydata(dot)com> writes:
> > When building from source on a Windows 10 VM using MinGW (8.1.0), I
> > get a random number of regression failures off the REL_13_STABLE
> > branch. I debugged this a little bit and found out that the "random"
> > number of failures is fully dependent on the machine and if I disable
> > the "stats_ext.sql" regression test; all other tests pass without
> > issue. When the "stats_ext.sql" regression test runs, it causes a
> > database exception and PostgreSQL crashes.
>
> Hmph ... it's weird that we have not seen this in the buildfarm.
> Have you tried to extract any info from the crash, like a stack trace?
>
> > I did some digging and determined that on the REL_13_STABLE branch
> > this instability was introduced with this commit
> > "b380484a850b6bf7d9fc0d85c555a2366e38451f"[1]. This corresponds to
> > commit "19f5a37b9fc48a12c77edafb732543875da2f4a3"[1] on master. I
> > worked backwards from there to determine when the regressions stopped
> > failing and determined that with commit
> > "be0a6666656ec3f68eb7d8e7abab5139fcd47012"[2] the regression tests are
> > no longer failing.
>
> I'm having a hard time believing that b380484a8 would have introduced
> a portability problem, and an even harder time believing that be0a66666
> would have resolved it if so. What seems more likely is that there's
> some underlying issue such as a memory stomp, that the first commit
> accidentally exposed and the second one accidentally hid again.
> So, even if back-patching be0a66666 seemed feasible from a stability
> standpoint (which I don't think it is), I fear it'd just mask a problem
> that would eventually bite us again.
>
> So I think we need to dig down and try to identify the root cause,
> without any preconceptions about how to fix it. Again, a stack trace
> would be pretty useful. Or at least some info about which step of
> stats_ext.sql is crashing.
>
> regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
pg_logfile_output.txt | text/plain | 3.2 KB |
stats_ext_psql_output.txt | text/plain | 2.0 KB |