From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Matt Newell <newellm(at)blur(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, it(at)blur(dot)com
Subject: Re: LISTEN denial of service with aborted transaction
Date: 2015-09-30 01:58:35
Message-ID: 15532.1443578315@sss.pgh.pa.us
Lists: pgsql-hackers
Matt Newell <newellm(at)blur(dot)com> writes:
> On Monday, September 28, 2015 07:22:26 PM Tom Lane wrote:
>> Possibly. sinval catchup notification works like that, so you could maybe
>> invent a similar mechanism for notify. Beware that we've had to fix
>> performance issues with sinval catchup; you don't want to release a whole
>> bunch of processes to do work at the same time.
> I'll have to think about this more, but I don't envision it causing such a
> scenario.
> The extra processing that will happen at signaling time would have been done
> anyway when the aborted transaction is finally rolled back; the only extra
> work is copying the relevant notifications to local memory.
Hm. An idle-in-transaction listening backend will eventually cause a more
serious form of denial of service: once the pg_notify SLRU area fills up
to its maximum of 8GB, no more new messages can be inserted, and every
transaction that tries to do a NOTIFY will abort. However, that's a
documented hazard and we've not really had complaints about it. In any
case, trying to fix that by having such a backend absorb messages into
local memory doesn't seem like a great idea: if it sits for long enough,
its local memory usage will bloat. Eventually you'd have a failure
anyway.
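
For reference, the failure comes from the queue-full check that NOTIFY
runs at pre-commit time; paraphrasing (not quoting exactly) the loop in
PreCommit_Notify in src/backend/commands/async.c:

    /* Paraphrased from PreCommit_Notify: insert pending notifications
     * into the SLRU queue, erroring out if it has hit its size limit. */
    while (nextNotify != NULL)
    {
        /* Must hold the lock to examine or change queue state. */
        LWLockAcquire(AsyncQueueLock, LW_EXCLUSIVE);
        if (asyncQueueIsFull())
            ereport(ERROR,
                    (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                     errmsg("too many notifications in the NOTIFY queue")));
        nextNotify = asyncQueueAddEntries(nextNotify);
        LWLockRelease(AsyncQueueLock);
    }

Once the queue can't advance past the idle listener's position, every
pass through that check fails.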
> Regardless, it might not be worthwhile, since my patch for #1 seems to
> address the symptom that bit me.
I looked at this patch for a bit and found it kind of ugly, particularly
the business about "we allow one process at a time to advance
firstUncommitted by using advancingFirstUncommitted as a mutex".
That doesn't sound terribly safe or performant.
After further thought, I wonder whether instead of having an incoming
listener initialize its "pos" to QUEUE_TAIL, we couldn't have it
initialize to the maximum "pos" among existing listeners (with a floor of
QUEUE_TAIL of course). If any other listener has advanced over a given
message X, then X was committed (or aborted) earlier and the new listener
is not required to return it; so it should be all right to take that
listener's "pos" as a starting point.
The minimum-code-change way to do that would be to compute the max pos
the hard way while Exec_ListenPreCommit holds the AsyncQueueLock, which
it would now have to do in exclusive mode. That's a little bit annoying
but it's surely not much worse than what SignalBackends has to do after
every notification.
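
Something along these lines (an untested sketch; QUEUE_POS_MAX would be a
new macro alongside the existing QUEUE_POS_MIN):

    /* Sketch for Exec_ListenPreCommit: start reading at the
     * furthest-ahead existing listener, never before the queue tail. */
    QueuePosition max;
    int           i;

    LWLockAcquire(AsyncQueueLock, LW_EXCLUSIVE);
    max = QUEUE_TAIL;                       /* floor */
    for (i = 1; i <= MaxBackends; i++)
    {
        if (QUEUE_BACKEND_PID(i) != InvalidPid)
            max = QUEUE_POS_MAX(max, QUEUE_BACKEND_POS(i));
    }
    QUEUE_BACKEND_POS(MyBackendId) = max;
    QUEUE_BACKEND_PID(MyBackendId) = MyProcPid;
    LWLockRelease(AsyncQueueLock);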
Alternatively, we could try to maintain a shared pointer that is always
equal to the frontmost backend's "pos". But I think that would require
more locking during queue reading than exists now; so it's far from clear
it would be a win, especially in systems where listening sessions last for
a reasonable amount of time (so that Exec_ListenPreCommit is infrequent).
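
To illustrate the extra locking: today asyncQueueReadAllNotifications can
update the backend's own "pos" while holding AsyncQueueLock in shared
mode, since each backend writes only its own slot. Maintaining a shared
frontmost pointer (the "frontmost" field below is hypothetical) would
force the exclusive lock on every read cycle, something like:

    /* Hypothetical: this update currently needs only LW_SHARED,
     * because each backend writes only its own queue position. */
    LWLockAcquire(AsyncQueueLock, LW_EXCLUSIVE);
    QUEUE_BACKEND_POS(MyBackendId) = pos;
    /* frontmost = hypothetical shared field tracking the max pos */
    asyncQueueControl->frontmost =
        QUEUE_POS_MAX(asyncQueueControl->frontmost, pos);
    LWLockRelease(AsyncQueueLock);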
regards, tom lane