Re: Buildfarm issues on specific machines

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: darcy(at)dbitech(dot)ca, remi_zara(at)mac(dot)com, books(at)ejurka(dot)com, markw(at)osdl(dot)org, decibel(at)decibel(dot)org, pete(at)economics(dot)utoronto(dot)ca, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Buildfarm issues on specific machines
Date: 2005-07-17 15:02:40
Message-ID: 42DA7310.6070404@dunslane.net
Lists: pgsql-hackers


Tom,

Thanks for this. I regularly send out private emails about what appear
to be local issues.

Tom Lane wrote:

>I spent a little time today cleaning up easily-fixed problems that are
>causing buildfarm failures in various back branches. Hopefully that
>will result in a few more "green" entries over the next few days. While
>I was looking, I noticed several machines that seem to be failing
>because of local issues:
>
>potoroo [HEAD, 7.4]: lock file "/tmp/.s.PGSQL.65432.lock" already exists
>
>I'm not sure if this is a problem with a stale lock file left around
>from an old run, or if it happens because the machine is configured to
>try to build/test several branches in parallel. In any case, it might
>be worthwhile to try to hack the buildfarm script so that the Unix
>socket files are allocated in a per-branch scratch directory, not in
>/tmp. Or try to change pg_regress to use different port numbers for
>different branches?
>
>

Buildfarm is set up so that each branch gets its own non-standard port.
It also prevents you (via a lock file) from running concurrently on a
branch unless you have multiple repositories. So only "make check"
should have any problems here, not any of the other tests, and only
because the port for that set of tests is hardcoded.

So I think the right (and certainly simplest) thing is to derive the
temp install port from the default port. Something like what is below - I
chose 5 instead of 6 as the leading digit to avoid overflowing the
65535 port limit when the default port is a high four-digit number. I guess
if we were paranoid we could also check for five-digit default port numbers.
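
To make the arithmetic concrete, here is a minimal sketch of the derivation plus the "paranoid" check. The function name temp_port is hypothetical and is not part of pg_regress.sh; the patch itself only substitutes the default port into the script at build time.

```shell
# Hypothetical helper (not in pg_regress.sh): derive the temp-install port
# by prefixing the configured default port with the digit 5.
temp_port() {
    default_port=$1

    # Accept only non-empty strings made up entirely of digits.
    case $default_port in
        ''|*[!0-9]*)
            echo "default port is not numeric: $default_port" >&2
            return 1
            ;;
    esac

    # The paranoid check: with five or more digits the derived value would
    # have at least six digits and could not fit in a 16-bit port number.
    if [ "${#default_port}" -gt 4 ]; then
        echo "default port $default_port is too long to prefix" >&2
        return 1
    fi

    # Any default of up to four digits yields at most 59999, below 65535.
    echo "5$default_port"
}
```

With the stock default of 5432 this yields 55432, and a default of 9999 yields 59999. A leading 6 would have produced 69999 in the second case, which is past the limit.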

cheers

andrew

Index: src/test/regress/GNUmakefile
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/GNUmakefile,v
retrieving revision 1.49
diff -c -r1.49 GNUmakefile
*** src/test/regress/GNUmakefile 11 May 2005 21:52:03 -0000 1.49
--- src/test/regress/GNUmakefile 17 Jul 2005 14:52:59 -0000
***************
*** 45,50 ****
--- 45,51 ----
-e 's,@libdir@,$(libdir),g' \
-e 's,@pkglibdir@,$(pkglibdir),g' \
-e 's,@datadir@,$(datadir),g' \
+ -e 's,@default_port@,$(default_port),g' \
-e 's/@VERSION@/$(VERSION)/g' \
-e 's/@host_tuple@/$(host_tuple)/g' \
-e 's,@GMAKE@,$(MAKE),g' \
Index: src/test/regress/pg_regress.sh
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/pg_regress.sh,v
retrieving revision 1.58
diff -c -r1.58 pg_regress.sh
*** src/test/regress/pg_regress.sh 25 Jun 2005 23:04:06 -0000 1.58
--- src/test/regress/pg_regress.sh 17 Jul 2005 14:52:59 -0000
***************
*** 342,348 ****
unset PGHOST
unset PGHOSTADDR
fi
! PGPORT=65432
export PGPORT

# Get rid of environment stuff that might cause psql to misbehave
--- 342,348 ----
unset PGHOST
unset PGHOSTADDR
fi
! PGPORT=5@default_port@
export PGPORT

# Get rid of environment stuff that might cause psql to misbehave

>osprey [HEAD]: could not create shared memory segment: Cannot allocate memory
>DETAIL: Failed system call was shmget(key=2, size=1957888, 03600).
>
>Kernel shmem settings too small...
>
>dragonfly [HEAD]: libz link error
>
>As per earlier discussion, I maintain this is local misconfiguration.
>
>cobra [7.4, 7.3, 7.2]: --with-tcl but no Tk
>
>Possibly adding --without-tk to the configure options is the right answer.
>Otherwise, install Tk or remove --with-tcl.
>
>cuckoo [7.3, 7.2]: --enable-nls without OS support
>
>This looks like pilot error; but the later branches don't fail on this
>machine, so did we change something in this area?
>
>caribou [7.2]: no "flex" installed
>
>This looks like pilot error as well, though again I don't understand why the
>later branches seem to work. Are we sure the same PATH is being used for
>every branch here? Why doesn't the buildfarm report for 7.2 show the PATH?
>
> regards, tom lane
>
