Weirdness using Executor Hooks

From: Eric Ridge <eebbrr(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Weirdness using Executor Hooks
Date: 2015-06-18 20:36:17
Message-ID: CANcm6waOgyFxYyZeVjnJs3WRPJPnf0kLmVVmhNhLy9cKnacYDA@mail.gmail.com
Lists: pgsql-hackers

I've written an extension that hooks ExecutorStart_hook and
ExecutorEnd_hook. The hooks are assigned in _PG_init() (with the previous
values saved to static variables) and restored to those previous values in
_PG_fini(). Possibly also of interest: the extension library is listed in
postgresql.conf under local_preload_libraries. This is with Postgres 9.3.4.

What happens is that rarely (and of course never on my development
machine), the saved "prev_ExecutorXXXHook" gets set to the current value of
ExecutorXXX_hook, so when my hook function is called:

static void my_executor_start_hook(QueryDesc *queryDesc, int eflags)
{
    executorDepth++;

    /* this ends up equal to my_executor_start_hook, so it recurses forever */
    if (prev_ExecutorStartHook)
        prev_ExecutorStartHook(queryDesc, eflags);
    else
        standard_ExecutorStart(queryDesc, eflags);
}

it loops endlessly on itself (i.e., prev_ExecutorStartHook ==
my_executor_start_hook). Based on GDB backtraces, it looks like gcc
compiles this into some form of tail recursion, as the backtraces just sit
on the line that calls prev_ExecutorXXXHook(...). The backend has to be
SIGKILL'd.
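
To make the symptom concrete, here is a minimal standalone C sketch. It is not
PostgreSQL code and not a diagnosis. It assumes, purely for illustration, that
the save-and-install step somehow runs twice in one process without the restore
step in between. Every name in it (Executor_hook, sim_init, sim_init_guarded,
run_with_limit) is a hypothetical stand-in, and the call limit exists only so
the simulation terminates instead of spinning like the real backend.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the executor hook type; not the real signature. */
typedef void (*hook_type)(int limit);

static hook_type Executor_hook = NULL;  /* stand-in for ExecutorStart_hook */
static hook_type prev_hook = NULL;      /* stand-in for prev_ExecutorStartHook */
static int calls = 0;                   /* how many times my_hook has run */

/* Stand-in for standard_ExecutorStart(): does nothing here. */
static void standard_executor(int limit) { (void) limit; }

static void my_hook(int limit)
{
    calls++;
    if (calls >= limit)   /* safety valve: the real hook has no such limit */
        return;
    if (prev_hook)
        prev_hook(limit); /* if prev_hook == my_hook, this calls itself */
    else
        standard_executor(limit);
}

/* Unguarded save-and-install, in the style described for _PG_init(). */
static void sim_init(void)
{
    prev_hook = Executor_hook;
    Executor_hook = my_hook;
}

/* Assumed defensive variant: skip installation if already installed. */
static void sim_init_guarded(void)
{
    if (Executor_hook == my_hook)
        return;
    prev_hook = Executor_hook;
    Executor_hook = my_hook;
}

/* Return the simulated process to its initial state. */
static void sim_reset(void)
{
    Executor_hook = NULL;
    prev_hook = NULL;
    calls = 0;
}

/* Invoke the installed hook once and report how often my_hook ran. */
static int run_with_limit(int limit)
{
    assert(Executor_hook != NULL);
    calls = 0;
    Executor_hook(limit);
    return calls;
}
```

In this sketch, calling sim_init() twice leaves prev_hook equal to my_hook, so
one invocation of the hook keeps re-entering itself until the artificial limit
stops it. With sim_init_guarded(), the second call is a no-op and the hook
falls through to the standard function. Whether anything like a second
initialization actually happens in the real backends is exactly what I don't
know.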

I've followed the patterns set forth in both the 'auto_explain' and
'pg_stat_statements' contrib extensions and I've been over my code about 3
dozen times now, and I just can't figure out what's going on. Clearly the
hooks are known-to-work, and I'm stumped.

One theory I have is that I've got a bug somewhere that's overwriting
memory, but it's quite the coincidence that only the two saved prev hook
pointers are being changed and being changed to very specific values.

Since it only happens rarely (and never for me during development), another
theory is based on the fact that this extension is under pretty constant
development/deployment. When we deploy a new binary (and run ALTER
EXTENSION UPDATE) we don't restart Postgres, so maybe the backends that were
already active and initialized with the previous version are getting
confused (maybe the kernel re-mmaps the .so or something, I dunno?). I
always seem to hear about the problem after a backend has been endlessly
spinning for a few days. :(

Have any of y'all seen anything like this and could I be on the right track
with my second theory?

*scratching head*,

eric
