Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Umair Shahid <umair(dot)shahid(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)
Date: 2016-06-24 02:21:32
Message-ID: CAMsr+YHwFDSd7vTESe=w7Fs0u=ErkwkeFoz-GdgCLiqA+CHaTg@mail.gmail.com
Lists: pgsql-hackers

On 24 June 2016 at 05:17, Umair Shahid <umair(dot)shahid(at)gmail(dot)com> wrote:

> On Fri, Jun 24, 2016 at 2:14 AM, Umair Shahid <
> umair(dot)shahid(at)2ndquadrant(dot)com> wrote:
>
>>
>> ---------- Forwarded message ----------
>> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
>> Date: Thu, Jun 23, 2016 at 9:32 PM
>> Subject: Re: [pgsql-packagers] PG 9.6beta2 tarballs are ready
>> To: Magnus Hagander <magnus(at)hagander(dot)net>
>> Cc: Umair Shahid <umair(dot)shahid(at)2ndquadrant(dot)com>, Dave Page <
>> dpage(at)postgresql(dot)org>, PostgreSQL Packagers <
>> pgsql-packagers(at)postgresql(dot)org>
>>
>>
>> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> > That makes more sense as the joinrel stuff *has* been changed between
>> the
>> > two betas. I'm sure someone who's touched that code (Tom?) can comment
>> on
>> > that part..
>>
>> It still makes little sense to me, as the previous reports say that the
>> problem happened during bootstrap, and the planner does not run
>> during bootstrap.
>>
>> Could we get a look at debug_query_string in the coredump, to possibly
>> narrow down where the crash is really happening?
>>
>
> Moving thread to -hackers ...
>
> debug_query_string is
>
> * "INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsubid,
> t.description FROM tmp_pg_description t, pg_class c WHERE c.relname =
> t.classname;"*
>
> Happening in "setup_description"
>
>

I was helping Haroon with this last night. I don't have access to the
original thread and he's not around so I don't know how much he said. I'll
repeat our findings here.

During debugging I found that:

* A VS 2013 build (performed by Haroon and copied to the test host) crashes
consistently with the reported symptoms - "performing post-bootstrap
initialization ... child process was terminated by exception 0xC0000005"

* The issue doesn't happen in a VS 2015 build done on the test host

* I couldn't use just-in-time debugging because the restricted execution
token setup isolated the process. For the same reason, breakpoints stop
working in initdb.c after line 3557.

* To get a backtrace, I had to:

  * Launch a VS x86 command prompt
  * devenv /debugexe bin\initdb.exe -D test
  * Set breakpoints at initdb.c:3557 and initdb.c:3307
  * Run
  * When it traps at get_restricted_token(), manually move the execution
    pointer over the setup of the restricted execution token by dragging &
    dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
    comment it out and rebuild, but I was working with a supplied binary.
  * Continue until the next breakpoint
  * Launch Process Explorer and find the pid of the postgres child process
  * Debug->Attach to Process, attach to the child postgres. This doesn't
    detach the parent; VS does multiprocess debugging.
  * Continue execution
  * VS will trap on the child when it crashes

* It is an access violation (segfault) in postgres.exe when attempting to
read memory at 0xFFFFFFFFFFFFFFFF in calc_joinrel_size_estimate() at
costsize.c:3940

    fkselec = get_foreign_key_join_selectivity(root,
                                               outer_rel->relids,
                                               inner_rel->relids,
                                               sjinfo,
                                               &restrictlist);

with debug_query_string:

0x0000000009bf6140 "INSERT INTO pg_description SELECT t.objoid, c.oid,
t.objsubid, t.description FROM tmp_pg_description t, pg_class c WHERE
c.relname = t.classname;\n"

Backtrace:

Exception thrown at 0x00000001401A5A81 in postgres.exe: 0xC0000005:
Access violation reading location 0xFFFFFFFFFFFFFFFF.

> postgres.exe!calc_joinrel_size_estimate(PlannerInfo * root, RelOptInfo * outer_rel, RelOptInfo * inner_rel, double outer_rows, double inner_rows, SpecialJoinInfo * sjinfo, List * restrictlist) Line 3944 C
  postgres.exe!set_joinrel_size_estimates(PlannerInfo * root, RelOptInfo * rel, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo, List * restrictlist) Line 3852 C
  postgres.exe!build_join_rel(PlannerInfo * root, Bitmapset * joinrelids, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo, List * * restrictlist_ptr) Line 521 C
  postgres.exe!make_join_rel(PlannerInfo * root, RelOptInfo * rel1, RelOptInfo * rel2) Line 721 C
  postgres.exe!make_rels_by_clause_joins(PlannerInfo * root, RelOptInfo * old_rel, ListCell * other_rels) Line 266 C
  postgres.exe!join_search_one_level(PlannerInfo * root, int level) Line 69 C
  postgres.exe!standard_join_search(PlannerInfo * root, int levels_needed, List * initial_rels) Line 2172 C
  postgres.exe!query_planner(PlannerInfo * root, List * tlist, void(*)(PlannerInfo *, void *) qp_callback, void * qp_extra) Line 255 C
  postgres.exe!grouping_planner(PlannerInfo * root, char inheritance_update, double tuple_fraction) Line 1695 C
  postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse, PlannerInfo * parent_root, char hasRecursion, double tuple_fraction) Line 775 C
  postgres.exe!standard_planner(Query * parse, int cursorOptions, ParamListInfoData * boundParams) Line 312 C
  postgres.exe!pg_plan_query(Query * querytree, int cursorOptions, ParamListInfoData * boundParams) Line 800 C
  postgres.exe!exec_simple_query(const char * query_string) Line 1023 C
  postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4076 C
  postgres.exe!main(int argc, char * * argv) Line 227 C

Local vars:

+ inner_rel 0x0000000009dfd170 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009d6d718 {...} ...} RelOptInfo *
  inner_rows 270.00000000000000 double
+ outer_rel 0x00000001401ded48 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
  outer_rows 2.653352065130e-314#DEN double
+ restrictlist 0x0000000009d6f7f8 {type=T_List (656) length=1 head=0x0000000009d6f7d8 {data={ptr_value=0x0000000009d6e980 ...} ...} ...} List *
+ root 0x0000000009dfd800 {type=1 parse=0x000000000067d220 {type=T_AllocSetContext (601) commandType=CMD_UNKNOWN (0) ...} ...} PlannerInfo *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009dfcfd8 {nwords=1 words=0x0000000009dfcfdc {...} } ...} SpecialJoinInfo *

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
