From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Query cancel seems to be broken in master since Oct 17 |
Date: | 2016-10-18 14:03:39 |
Message-ID: | 21605.1476799419@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> On 10/18/2016 04:13 PM, Tom Lane wrote:
>> There's a smoking gun in the postmaster log:
>> 2016-10-18 09:10:34.547 EDT [18502] LOG: wrong key in cancel request for process 18491
> Ok, I've reverted that commit for now. It clearly needs more thought,
> because of this, and the pademelon failure discussed on the other thread.
I think that was an overreaction. The problem is pretty obvious after
adding some instrumentation:
    2016-10-18 09:57:47.508 EDT [21229] LOG: wrong key (0x7B7E4D5E, expected 0xF0F804017B7E4D5E) in cancel request for process 21228
To wit, the various cancel_key backend variables are declared as "long",
and the new code
    if (!pg_strong_random(&MyCancelKey, sizeof(MyCancelKey)))
is therefore computing an 8-byte random value on 64-bit-long machines.
But only 4 bytes go to the client and come back.
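
To make the mismatch concrete, here is a minimal standalone sketch, not the actual postmaster code. It assumes a platform where "long" is 64 bits and models that with uint64_t so it behaves the same everywhere; the names wide_key_t, key_sent_to_client and cancel_key_matches are hypothetical. The constants are the ones from the instrumented log line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cancel key declared as "long" on a 64-bit-long machine. */
typedef uint64_t wide_key_t;

/* Only 32 bits of the key are sent to the client: the high half is dropped. */
static uint32_t key_sent_to_client(wide_key_t backend_key)
{
    return (uint32_t) backend_key;      /* keeps the low 32 bits */
}

/* Compare the key echoed by the client against the full stored value. */
static bool cancel_key_matches(wide_key_t stored_key, uint32_t key_from_client)
{
    return (wide_key_t) key_from_client == stored_key;
}

/* Replays the values from the log message above. */
void check_wide_key_example(void)
{
    wide_key_t stored = 0xF0F804017B7E4D5EULL;  /* all 8 bytes were randomized */
    uint32_t   echoed = key_sent_to_client(stored);

    assert(echoed == 0x7B7E4D5EU);              /* what the client sends back */
    assert(!cancel_key_matches(stored, echoed)); /* "wrong key" */
}
```

Whenever any of the upper 32 bits of the stored value are nonzero, the comparison fails, which is what the "wrong key" message reports.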
The cleanest fix might be to change those various "long" variables
to uint32. You'd have to think about how to handle the ntohl/htonl
calls that are used on them, though.
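
A rough sketch of what that could look like, under the assumption that the key becomes exactly 32 bits wide. The type and function names here (cancel_key_t, cancel_key_to_wire, and so on) are made up for illustration and are not the backend's actual declarations. With a 4-byte key, a call of the form pg_strong_random(&key, sizeof(key)) would fill only the bytes that actually travel to the client, and htonl/ntohl operate on the value without any widening.

```c
#include <arpa/inet.h>   /* htonl, ntohl (POSIX) */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the key is 32 bits regardless of sizeof(long). */
typedef uint32_t cancel_key_t;

/* Convert the key to network byte order before sending it to the client. */
static uint32_t cancel_key_to_wire(cancel_key_t key)
{
    return htonl(key);
}

/* Convert a key received in a cancel request back to host byte order. */
static cancel_key_t cancel_key_from_wire(uint32_t wire_value)
{
    return ntohl(wire_value);
}

/* The stored key and the echoed key now have the same width. */
static bool cancel_request_key_ok(cancel_key_t stored, uint32_t wire_value)
{
    return cancel_key_from_wire(wire_value) == stored;
}

/* Round-trips a sample key through the wire representation. */
void check_uint32_key_example(void)
{
    cancel_key_t stored = 0x7B7E4D5EU;
    uint32_t     wire   = cancel_key_to_wire(stored);

    assert(cancel_request_key_ok(stored, wire));
}
```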
regards, tom lane