Re: What is happening on buildfarm member dugong

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: What is happening on buildfarm member dugong
Date: 2007-09-11 13:46:42
Message-ID: 27305.1189518402@sss.pgh.pa.us
Lists: pgsql-hackers

"Sergey E. Koposov" <math(at)sai(dot)msu(dot)ru> writes:
> On Tue, 11 Sep 2007, Tom Lane wrote:
>> dugong has been failing contribcheck repeatably for the last day or so,
>> with a very interesting symptom: CREATE DATABASE is failing with

> The reason for that is that I've been trying to switch from 9.1 to 10.0
> version of the ICC compiler.

Hah, interesting.

> A few notes:
> 1) without --enable-cassert everything works
> 2) with --enable-cassert, the only thing that fails in the tests is
> contrib-installcheck...
> 3) I also recently tried compiling PG with the -O0 flag, and it actually
> worked.
> 4) Also, just now I tried to compile PG 8.2.4 and the same problem occurs.

> So, I can either switch back to 9.1 completely and forget about it, or we
> can try to find or at least localize this bug (if it is ICC's fault). But to
> do that, I need some advice/help on how best to go about it...

Well, the first thing I'd suggest is trying to localize which Assert
makes it fail. From the bug's behavior I think it is highly probable
that the problem is in fsync signalling, which puts it either in
bgwriter.c or md.c. Try recompiling those modules separately without
cassert (leaving all else enabled) and see if the problem comes and
goes; if so, comment out one Assert at a time till you find which one.

Actually ... another possibility is that it's not directly an Assert,
but CLOBBER_FREED_MEMORY that exposes the bug. (This would suggest
that the compiler is trying to re-order memory accesses around a pfree.)
So before you get into the one-assert-at-a-time test, try with
--enable-cassert but modify pg_config_manual.h to not define
CLOBBER_FREED_MEMORY.
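
To illustrate why clobbering could expose a reordering problem, here is a
minimal sketch using a toy one-slot arena rather than PostgreSQL's real
allocator. The names toy_alloc, toy_free and TOY_CLOBBER_FREED_MEMORY are
hypothetical stand-ins, and the 0x7F fill byte is assumed to match what the
backend uses when CLOBBER_FREED_MEMORY is defined. If a load that the source
performs before the free is effectively moved after it, the caller sees the
fill pattern instead of the stored value:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CLOBBER_FREED_MEMORY; set to 0 to disable. */
#define TOY_CLOBBER_FREED_MEMORY 1

/*
 * One-slot toy arena. The storage is static, so it stays addressable after
 * toy_free() and reading it afterwards is well-defined in this sketch
 * (unlike reading real freed memory).
 */
static unsigned char toy_slot[sizeof(uint32_t)];
static int toy_in_use = 0;

static void *toy_alloc(void)
{
    assert(!toy_in_use);
    toy_in_use = 1;
    return toy_slot;
}

static void toy_free(void *p)
{
    assert(p == toy_slot && toy_in_use);
#if TOY_CLOBBER_FREED_MEMORY
    /* Overwrite the released chunk so stale reads become visible. */
    memset(p, 0x7F, sizeof(toy_slot));
#endif
    toy_in_use = 0;
}

/* memcpy-based accessors keep the example free of aliasing questions. */
static void store_u32(void *p, uint32_t v) { memcpy(p, &v, sizeof v); }
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Intended order: read the value, then release the chunk. */
uint32_t read_then_free(uint32_t value)
{
    void *p = toy_alloc();
    store_u32(p, value);
    uint32_t v = load_u32(p);
    toy_free(p);
    return v;
}

/* What a load moved past the free would observe. */
uint32_t free_then_read(uint32_t value)
{
    void *p = toy_alloc();
    store_u32(p, value);
    toy_free(p);
    return load_u32(p);
}
```

With clobbering enabled, read_then_free(42) returns 42 while
free_then_read(42) returns 0x7F7F7F7F. With TOY_CLOBBER_FREED_MEMORY set to 0
both return 42, which is how such a bug could stay hidden in a build that
does not clobber.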

This could be a compiler bug, or it could be our fault --- might need
a "volatile" on some pointer or other, for example, to prevent the
compiler from making an otherwise legitimate assumption. So it seems
worth chasing it down.
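
As a sketch of the kind of change a "volatile" would be (the struct and
function names here are hypothetical, not taken from bgwriter.c or md.c):
accessing a shared field through a volatile-qualified pointer obliges the
compiler to actually perform each load and store, and to keep those volatile
accesses in program order relative to one another, rather than caching the
value in a register or rearranging them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared structure, for illustration only. */
typedef struct ToyRequestQueue
{
    int num_requests;
} ToyRequestQueue;

/*
 * Take all pending requests: return the current count and reset it to zero.
 * Going through a volatile pointer means the load and the store below are
 * both emitted, in this order.
 */
int toy_take_requests(ToyRequestQueue *queue)
{
    volatile ToyRequestQueue *q = queue;
    int n;

    assert(queue != NULL);
    n = q->num_requests;    /* real load from memory */
    q->num_requests = 0;    /* real store to memory */
    return n;
}
```

Note that volatile only constrains the compiler; it is not a substitute for
locking or memory barriers between processes.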

BTW, does ICC have any switch corresponding to gcc's -fno-strict-aliasing?
I see that configure tries to feed that switch to it, but it might
want some other spelling.

regards, tom lane
