Re: Problem with PostgreSQL 9.2.7 and make check on AIX 7.1

From: Rainer Tammer <pgsql(at)spg(dot)schulergroup(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org, cbbrowne(at)gmail(dot)com
Subject: Re: Problem with PostgreSQL 9.2.7 and make check on AIX 7.1
Date: 2014-02-25 17:06:15
Message-ID: 530CCD87.9080407@spg.schulergroup.com
Lists: pgsql-bugs

Hello,
I will try to get some debug code in the SIGINT handler.
In the worst case I will start a system trace.
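
A minimal sketch of what such debug code could look like, assuming a plain
POSIX environment: a handler installed with sigaction() and SA_SIGINFO
receives a siginfo_t whose si_pid identifies the sender when the signal was
raised with kill() or sigqueue(). This is a standalone illustration, not
PostgreSQL code; all debug_* names are hypothetical, and the real handlers in
the backend are installed differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical debug state: written by the handler, read by normal code. */
volatile sig_atomic_t debug_sigint_seen = 0;
volatile pid_t debug_sigint_sender = 0;

/* Three-argument handler form, available when SA_SIGINFO is set. */
void
debug_sigint_handler(int signo, siginfo_t *info, void *context)
{
    (void) signo;
    (void) context;

    /* si_pid is only meaningful for signals sent via kill() or sigqueue(). */
    if (info != NULL &&
        (info->si_code == SI_USER || info->si_code == SI_QUEUE))
        debug_sigint_sender = info->si_pid;
    debug_sigint_seen = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int
install_debug_sigint_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = debug_sigint_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Call from normal (non-handler) code to log what the handler recorded. */
void
debug_report_sigint(void)
{
    if (debug_sigint_seen)
        fprintf(stderr, "PID %ld got SIGINT from PID %ld\n",
                (long) getpid(), (long) debug_sigint_sender);
}
```

The handler only stores values and leaves the logging to
debug_report_sigint(), because fprintf() is not safe to call from inside a
signal handler.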

In the meantime I have built more versions on AIX 6.1.
I can see no failure on AIX 6.1, including 9.2.7.
I have installed the same C/C++ compiler on the AIX 6.1 box
as I have on the AIX 7.1 box - still the same result. All tests are OK.

So the problem is not compiler dependent.

Currently I am upgrading an AIX 7.1 test LPAR on a Power 5 box.
This way I can check if the problem is dependent on AIX 7.1
or Power 7+. (Some years ago there was a problem with
Java on the newer CPUs.)

What code path is executed if the timeout passes and
the signal is sent?

- Where exactly is the signal sent?
--> Is the signal really sent?

- Where is the first entry in the handler?
--> Do we receive the signal?
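
The two questions above can be tested in isolation with a small standalone
probe. The sketch below assumes the timeout is driven by an interval timer
that delivers SIGALRM (setitimer() with ITIMER_REAL is used here as an
assumption), and checks two things: whether SIGALRM is currently blocked in
the signal mask, and whether an armed timer actually reaches a handler. The
debug_* names are hypothetical and nothing here is taken from the PostgreSQL
sources.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical counter: incremented each time the handler is entered. */
volatile sig_atomic_t debug_alarm_count = 0;

void
debug_alarm_handler(int signo)
{
    (void) signo;
    debug_alarm_count++;
}

/* Returns 1 if SIGALRM is blocked in the current mask, 0 if not, -1 on error. */
int
debug_sigalrm_is_blocked(void)
{
    sigset_t current;

    /* With a NULL set, sigprocmask() only reports the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &current) != 0)
        return -1;
    return sigismember(&current, SIGALRM);
}

/*
 * Arm a one-shot timer for timeout_ms milliseconds and wait for it.
 * Returns 1 if the handler ran, 0 if it did not, -1 on a setup error.
 */
int
debug_wait_for_alarm(int timeout_ms)
{
    struct sigaction sa;
    struct itimerval timer;
    struct timespec wait;

    if (timeout_ms <= 0)
        return -1;              /* a zero value would disarm the timer */

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = debug_alarm_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    debug_alarm_count = 0;
    memset(&timer, 0, sizeof(timer));
    timer.it_value.tv_sec = timeout_ms / 1000;
    timer.it_value.tv_usec = (timeout_ms % 1000) * 1000;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
        return -1;

    /* Wait one second longer than the timeout; delivery interrupts the wait. */
    wait.tv_sec = timeout_ms / 1000 + 1;
    wait.tv_nsec = (long) (timeout_ms % 1000) * 1000000L;
    nanosleep(&wait, NULL);

    return debug_alarm_count > 0 ? 1 : 0;
}
```

If debug_sigalrm_is_blocked() returns 0 and debug_wait_for_alarm(2000) still
returns 0, the signal was either never generated or never delivered, which
would point at the platform rather than at the calling code.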

Bye
Rainer

P.S.: What do you think of a smoker on the IBM developer cloud?
If this is interesting I might organize the setup.

On 25.02.2014 17:35, Tom Lane wrote:
> Rainer Tammer <pgsql(at)spg(dot)schulergroup(dot)com> writes:
>> 1. fast shutdown
>> The unexpected "LOG: received fast shutdown request" is happening
>> on an installed instance. I have found other articles on the WEB which
>> describe the same problem (other platform) - unfortunately there was
>> no real solution to it.
> As far as we can tell, any SIGINT of the postmaster must be coming from
> outside the Postgres code. There are several places where SIGINT is
> generated internally, but I've just been through all of them again and
> it's pretty nearly impossible to believe that they could target the
> postmaster process rather than some child process. If you've got a
> way to instrument it and find out where the signal came from (eg what
> PID sent it), that would be interesting information.
>
>> 2. hang during make check
>> PostgreSQL 8.4.20 -> make check does finish without hang
>> PostgreSQL 9.0.16 -> hang
>> PostgreSQL 9.2.7 -> hang
> Interesting, since AFAIR there was no major surgery on the timeout
> code in 9.0. If you'd said 9.3 broke it, that wouldn't be so
> surprising ...
>
>> As far as I can see the hang is caused by the "set statement_timeout to
>> 2000;"
>> statement. Where would be a good start point to diagnose this problem??
> Well, the point is that the timeout is failing to happen. Is the SIGALRM
> signal being blocked? If it is delivered, why doesn't that spring the
> process off its wait? Anyway, I see you already found enable_sig_alarm
> and CheckStatementTimeout, so those are reasonable places to start
> injecting some additional logging. You might also need to instrument
> the backend SIGINT handler, StatementCancelHandler in postgres.c.
>
> regards, tom lane
>
>
