Re: BUG #13490: Segmentation fault on pg_stat_activity

From: Michael Bommarito <michael(at)bommaritollc(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13490: Segmentation fault on pg_stat_activity
Date: 2015-07-13 13:43:39
Message-ID: CAN=rtBjzBb+XXccg2Q+gcfX3sUbuGCg8x45T5iCTf+ON-DibNQ@mail.gmail.com
Lists: pgsql-bugs

Here are the locals for each frame, all the way back up to BackendStartup,
in case that helps as well. I will try to reproduce this over the coming
weekend with a "baseline" postgresql.conf, and on another system with a
clean import.

(gdb) bt
#0 get_tle_by_resno (tlist=0x7fd0d5da27c0, resno=resno@entry=6) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/parser/parse_relation.c:2832
#1 0x00007fd0d47cb9dd in pullup_replace_vars_callback (var=0x7fd0d5d9e958,
context=0x7fff52170620) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/optimizer/prep/prepjointree.c:2074
#2 0x00007fd0d481c3ea in replace_rte_variables_mutator (node=<optimized
out>, context=0x7fff52170620) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/rewrite/rewriteManip.c:1149
#3 0x00007fd0d478152c in expression_tree_mutator (node=0x7fd0d5d9e908,
mutator=0x7fd0d481c390 <replace_rte_variables_mutator>,
context=0x7fff52170620) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/nodes/nodeFuncs.c:2769
#4 0x00007fd0d47812b3 in expression_tree_mutator (node=<optimized out>,
mutator=0x7fd0d481c390 <replace_rte_variables_mutator>,
context=0x7fff52170620) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/nodes/nodeFuncs.c:2675
#5 0x00007fd0d481cc64 in replace_rte_variables (node=<optimized out>,
target_varno=<optimized out>, sublevels_up=sublevels_up@entry=0,
callback=callback@entry=0x7fd0d47cb880 <pullup_replace_vars_callback>,
callback_arg=callback_arg@entry=0x7fff521706c0,
outer_hasSubLinks=0x7fd0d5d30d6e "")
at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/rewrite/rewriteManip.c:1115
#6 0x00007fd0d47cd1c7 in pullup_replace_vars (context=0x7fff521706c0,
expr=<optimized out>) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/optimizer/prep/prepjointree.c:1982
#7 pull_up_simple_subquery (deletion_ok=<optimized out>,
containing_appendrel=0x0, lowest_nulling_outer_join=0x0,
lowest_outer_join=0x0, rte=0x7fd0d5d30ea8, jtnode=<optimized out>,
root=0x7fd0d5d9ee48) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/optimizer/prep/prepjointree.c:1030
#8 pull_up_subqueries_recurse (root=root@entry=0x7fd0d5d9ee48,
jtnode=<optimized out>, lowest_outer_join=lowest_outer_join@entry=0x0,
lowest_nulling_outer_join=lowest_nulling_outer_join@entry=0x0,
containing_appendrel=containing_appendrel@entry=0x0, deletion_ok=<optimized
out>)
at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/optimizer/prep/prepjointree.c:696
#9 0x00007fd0d47cc989 in pull_up_subqueries_recurse
(root=root@entry=0x7fd0d5d9ee48,
jtnode=0x7fd0d5d9e6c0, lowest_outer_join=lowest_outer_join@entry=0x0,
lowest_nulling_outer_join=lowest_nulling_outer_join@entry=0x0,
containing_appendrel=containing_appendrel@entry=0x0, deletion_ok=<optimized
out>,
deletion_ok@entry=0 '\000') at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/optimizer/prep/prepjointree.c:762
#10 0x00007fd0d47cd639 in pull_up_subqueries (root=root@entry=0x7fd0d5d9ee48)
at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/optimizer/prep/prepjointree.c:614
#11 0x00007fd0d47c5014 in subquery_planner (glob=glob@entry=0x7fd0d5d9edb8,
parse=parse@entry=0x7fd0d5d30d48, parent_root=parent_root@entry=0x0,
hasRecursion=hasRecursion@entry=0 '\000', tuple_fraction=0,
subroot=subroot@entry=0x7fff52170908)
at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/optimizer/plan/planner.c:374
#12 0x00007fd0d47c5975 in standard_planner (parse=0x7fd0d5d30d48,
cursorOptions=0, boundParams=0x0) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/optimizer/plan/planner.c:229
#13 0x00007fd0d4848034 in pg_plan_query (querytree=<optimized out>,
cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/tcop/postgres.c:809
#14 0x00007fd0d4848124 in pg_plan_queries
(querytrees=querytrees@entry=0x7fd0d5d30cf8,
cursorOptions=0, boundParams=boundParams@entry=0x0) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/tcop/postgres.c:868
#15 0x00007fd0d4929760 in BuildCachedPlan
(plansource=plansource@entry=0x7fd0d5d7d940,
qlist=0x7fd0d5d30cf8, qlist@entry=0x0, boundParams=boundParams@entry=0x0)
at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/utils/cache/plancache.c:951
#16 0x00007fd0d4929a98 in GetCachedPlan
(plansource=plansource@entry=0x7fd0d5d7d940,
boundParams=boundParams@entry=0x0, useResOwner=useResOwner@entry=0 '\000')
at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/utils/cache/plancache.c:1165
#17 0x00007fd0d48497ab in exec_bind_message (input_message=0x7fff52170be0)
at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/tcop/postgres.c:1774
#18 PostgresMain (argc=<optimized out>, argv=argv@entry=0x7fd0d5c8d950,
dbname=0x7fd0d5c8d840 "databasename", username=<optimized out>) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/tcop/postgres.c:4071
#19 0x00007fd0d45f239c in BackendRun (port=0x7fd0d5cd2c00) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/postmaster/postmaster.c:4159
#20 BackendStartup (port=0x7fd0d5cd2c00) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/postmaster/postmaster.c:3835
#21 ServerLoop () at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/postmaster/postmaster.c:1609
#22 0x00007fd0d47f18e1 in PostmasterMain (argc=5, argv=<optimized out>) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/postmaster/postmaster.c:1254
#23 0x00007fd0d45f30cd in main (argc=5, argv=0x7fd0d5c8c970) at
/tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/main/main.c:221
(gdb) info locals
tle = 0x0
l = 0x7fd0d5da2940
(gdb) select-frame 1
(gdb) info locals
tle = <optimized out>
rcon = 0x7fff521706c0
varattno = 6
newnode = <optimized out>
__func__ = "pullup_replace_vars_callback"
(gdb) select-frame 2
(gdb) info locals

newnode = <optimized out>
var = <optimized out>
__func__ = "replace_rte_variables_mutator"
(gdb) select-frame 3
(gdb) info locals

phinfo = <optimized out>
newnode = <optimized out>
__func__ = "expression_tree_mutator"
(gdb) select-frame 4
(gdb) info locals

resultlist = 0x0
temp = 0x7fd0d5d9e8e8
__func__ = "expression_tree_mutator"
(gdb) select-frame 5
(gdb) info locals

result = <optimized out>
context = {callback = 0x7fd0d47cb880 <pullup_replace_vars_callback>,
callback_arg = 0x7fff521706c0, target_varno = 1, sublevels_up = 0,
inserted_sublink = 0 '\000'}
__func__ = "replace_rte_variables"
(gdb) select-frame 6
(gdb) info locals

No locals.
(gdb) select-frame 7
(gdb) info locals

parse = 0x7fd0d5d30d48
subquery = <optimized out>
rvcontext = {root = 0x7fd0d5d9ee48, targetlist = 0x7fd0d5da27c0, target_rte
= 0x7fd0d5d30ea8, relids = 0x0, outer_hasSubLinks = 0x7fd0d5d30d6e "",
varno = 1, need_phvs = 0 '\000', wrap_non_vars = 0 '\000', rv_cache =
0x7fd0d5da2960}
varno = 1
subroot = 0x7fd0d5da2560
lc = <optimized out>
(gdb) select-frame 8
(gdb) info locals

varno = <optimized out>
rte = 0x7fd0d5d30ea8
__func__ = "pull_up_subqueries_recurse"
(gdb) select-frame 9
(gdb) info locals

sub_deletion_ok = <optimized out>
f = 0x7fd0d5d9e6c0
have_undeleted_child = 0 '\000'
l = 0x7fd0d5d9e720
__func__ = "pull_up_subqueries_recurse"
(gdb) select-frame 10
(gdb) info locals

No locals.
(gdb) select-frame 11
(gdb) info locals

root = 0x7fd0d5d9ee48
plan = <optimized out>
newWithCheckOptions = <optimized out>
newHaving = <optimized out>
hasOuterJoins = <optimized out>
l = <optimized out>
(gdb) select-frame 12
(gdb) info locals

result = <optimized out>
glob = 0x7fd0d5d9edb8
tuple_fraction = <optimized out>
root = 0x7fd0d5d9ed18
top_plan = <optimized out>
lp = <optimized out>
lr = <optimized out>
(gdb) select-frame 13
(gdb) info locals

plan = <optimized out>
(gdb) select-frame 14
(gdb) info locals

query = <optimized out>
stmt = <optimized out>
stmt_list = 0x0
query_list = 0x7fd0d5d30d28
(gdb) select-frame 15
(gdb) info locals

plan = <optimized out>
plist = <optimized out>
snapshot_set = 0 '\000'
spi_pushed = <optimized out>
plan_context = <optimized out>
oldcxt = 0x7fd0d5c8ccb0
(gdb) select-frame 16
(gdb) info locals

plan = <optimized out>
qlist = 0x0
customplan = 0 '\000'
__func__ = "GetCachedPlan"
(gdb) select-frame 17
(gdb) info locals

pformats = 0x0
psrc = 0x7fd0d5d7d940
portal = 0x7fd0d5ca6da0
query_string = 0x7fd0d5cd17a0 "SELECT application_name AS source,
client_addr AS ip, COUNT(*) AS total_connections FROM pg_stat_activity
WHERE pid <> pg_backend_pid() GROUP BY application_name, ip ORDER BY
COUNT(*) DESC, applicatio"...
portal_name = 0x7fd0d5d308d0 ""
stmt_name = 0x7fd0d5d308d1 ""
numPFormats = 0
saved_stmt_name = 0x0
rformats = 0x7fd0d5d30ce0
params = 0x0
save_log_statement_stats = 0 '\000'
msec_str = "SELECT
1\000\061\000\324\320\177\000\000\220\r\027R\377\177\000\000ᕔ\324\320\177\000"
numParams = 0
numRFormats = 1
cplan = <optimized out>
snapshot_set = <optimized out>
(gdb) select-frame 18
(gdb) info locals

firstchar = -707588896
input_message = {data = 0x7fd0d5d308d0 "", len = 10, maxlen = 1024, cursor
= 10}
local_sigjmp_buf = {{__jmpbuf = {140734570630176, -6783158886466612332,
140534916634656, 1, 0, 140534916918272, -6783158886368046188,
-6808758974801407084}, __mask_was_saved = 1, __saved_mask = {__val = {0,
140534916918272, 140534862700952, 140534916634800, 206158430256,
140734570630512, 140734570630304,
140534916918272, 52, 140734570630432, 140534895011200, 1024,
140734570630464, 140734570630628, 0, 140734570630400}}}}
send_ready_for_query = 0 '\000'
__func__ = "PostgresMain"
(gdb) select-frame 19
(gdb) info locals

ac = 1
secs = 489934357
usecs = 656467
i = 1
av = 0x7fd0d5c8d950
maxac = <optimized out>
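
As a side note on the frame 17 and 18 locals: an empty portal_name and stmt_name, numPFormats = 0, numParams = 0, numRFormats = 1 and input_message.len = 10 look consistent with a Bind message for the unnamed statement and unnamed portal, with no parameters and a single result format code. Below is a minimal Python sketch of that message body, laid out as I understand the Bind format in the frontend/backend protocol documentation. The helper names (cstring, build_bind_body, frame_bind_message) are made up for illustration, and the result format code of 0 (text) is an assumption, since the locals only show the rformats pointer and not its contents.

```python
import struct


def cstring(s: str) -> bytes:
    """Encode a protocol String: the bytes followed by a NUL terminator."""
    return s.encode("utf-8") + b"\x00"


def build_bind_body(portal, statement, param_formats, params, result_formats):
    """Build the body of a Bind message (without the type byte and length)."""
    body = cstring(portal) + cstring(statement)
    # Int16 count of parameter format codes, then the codes themselves.
    body += struct.pack("!H", len(param_formats))
    for code in param_formats:
        body += struct.pack("!h", code)
    # Int16 count of parameter values; each value is Int32 length + bytes,
    # with a length of -1 standing for NULL.
    body += struct.pack("!H", len(params))
    for value in params:
        if value is None:
            body += struct.pack("!i", -1)
        else:
            body += struct.pack("!i", len(value)) + value
    # Int16 count of result-column format codes, then the codes themselves.
    body += struct.pack("!H", len(result_formats))
    for code in result_formats:
        body += struct.pack("!h", code)
    return body


def frame_bind_message(body: bytes) -> bytes:
    """Prefix the body with 'B' and an Int32 length that counts itself."""
    return b"B" + struct.pack("!i", len(body) + 4) + body


# Unnamed portal, unnamed statement, no parameters, one result format code.
# The code value 0 (text) is assumed; it is not visible in the backtrace.
body = build_bind_body("", "", [], [], [0])
message = frame_bind_message(body)
print(len(body))  # 10
```

If that layout assumption holds, the body comes to 10 bytes, matching input_message.len in frame 18, which would point to a one-shot unnamed prepared statement rather than a named statement being re-executed.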

Thanks,
Michael J. Bommarito II, CEO
Bommarito Consulting, LLC
*Web:* http://www.bommaritollc.com
*Mobile:* +1 (646) 450-3387

On Mon, Jul 13, 2015 at 2:54 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Mon, Jul 13, 2015 at 4:16 AM, Michael Bommarito
> <michael(at)bommaritollc(dot)com> wrote:
> > This particular instance is from pghero, which is a monitoring tool. It
> > can be reproduced simply by querying pg_stat_activity in psql as well.
> > From a quick skim of their GitHub repo, pghero uses prepared statements
> > via Ruby.
> >
> > We have pg_stat_statements enabled, and can reproduce without pghero
> > set up as well. No other extensions loaded.
> >
> > On Jul 12, 2015 2:37 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>
> >> Michael Bommarito <michael(at)bommaritollc(dot)com> writes:
> >> > Here's the session with debug_query_string:
> >> > (gdb) printf "%s\n", debug_query_string
> >> > SELECT application_name AS source, client_addr AS ip, COUNT(*) AS
> >> > total_connections FROM pg_stat_activity WHERE pid <> pg_backend_pid()
> >> > GROUP BY application_name, ip ORDER BY COUNT(*) DESC,
> >> > application_name ASC, client_addr ASC
> >>
> >> Thanks. This still doesn't match the stack trace: in particular, this
> >> stack frame
> >>
> >> #3 0x00007fd0d478152c in expression_tree_mutator (node=0x7fd0d5d9e908,
> >> mutator=0x7fd0d481c390 <replace_rte_variables_mutator>,
> >> context=0x7fff52170620) at
> >> /tmp/buildd/postgresql-9.5-9.5~alpha1/build/../src/backend/nodes/nodeFuncs.c:2769
> >>
> >> indicates that we found a PlaceHolderInfo node in the expression tree that
> >> pullup_replace_vars() was applied to, but so far as I can see no such node
> >> should exist in the query tree generated by this query. The most likely
> >> theory seems to be that something clobbered the query tree while it was
> >> sitting in the plancache, causing this recursive function to follow a
> >> bogus pointer. But that doesn't leave us with a lot to go on.
> >>
> >> What can you tell us about the environment this is happening in?
> >> How is the client-side code executing the failing queries? (We know
> >> it's using extended query protocol, but is it preparing a statement
> >> and then executing it repeatedly, or just using a one-shot unnamed
> >> prepared statement?) What nondefault settings are in use on the
> >> server side? Do you have any extensions loaded, such as
> >> pg_stat_statements or auto_explain?
>
> FWIW, I have been fooling around with the query reported in the back
> trace upthread, playing a bit with the extended query protocol to
> send BIND messages with PQdescribePrepared and PQsendDescribePrepared,
> as well as with psql. While I am able to reproduce stack traces
> close to what you had, I am not seeing any crashes. I have also
> played a bit with pghero with pgbench running in parallel, and there
> were no problems, with or without pg_stat_statements loaded.
>
> In the backtrace you sent previously
> (http://www.postgresql.org/message-id/CAN=rtBipwKdHCtmXH3r4GNfUhF9e4ZfJbqcj7s_Ec9e2Mbf_LA@mail.gmail.com),
> what is the value of MyProcPid? Is it 12803 or 20696? If it is the
> former, do you have a backtrace for process 20696? What we may be
> looking at now is actually a side effect of the real problem, and as
> long as we do not have a real test case, I am afraid that finding the
> root problem is rather difficult.
> --
> Michael
>
