Re: BUG #14584: Segmentation fault importing large XML file

From: Jorge Solórzano <jorsol(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14584: Segmentation fault importing large XML file
Date: 2017-03-08 19:17:37
Message-ID: CA+cVU8Nm7Of6UuZH79pogesrh0MVVM8G01FNejBiGKKBphb1oA@mail.gmail.com
Lists: pgsql-bugs

Hi Pavel,

By large I mean big in size: 935M
Posts.xml: XML 1.0 document, UTF-8 Unicode text, with very long lines

I installed the debug symbols for libxml2, in case this helps:

#0 xmlParserPrintFileContextInternal (input=input@entry=0x55afc89ef4b0,
channel=0x55afc636ca30 <appendStringInfo>, data=0x55afc89e5cc0) at
../../error.c:181
cur = <optimized out>
base = <optimized out>
n = <optimized out>
col = <optimized out>
content =
"\000\004\000\000\000\000\000\000\260\364\236ȯU\000\000\000\000\000\000\000\000\000\000\330\340\236ȯU\000\000`\203D*\375\177\000\000M@XƯU
\000\000\300\\\236ȯU\000\000\260\364\236ȯU\000\000\"\000\000\000\000\000\000\000\332\370\023\340\241\177\000\000\240"
ctnt = <optimized out>
#1 0x00007fa1e00a587a in xmlParserPrintFileContext__internal_alias
(input=input@entry=0x55afc89ef4b0) at ../../error.c:231
No locals.
#2 0x000055afc6542a1c in xml_errorHandler (data=0x55afc89e59b0,
error=<optimized out>) at
/build/postgresql-9.6-ZHxyhz/postgresql-9.6-9.6.2/build/../src/backend/utils/adt/xml.c:1661
errFuncSaved = 0x7fa1e00a41b0 <xmlGenericErrorDefaultFunc>
errCtxSaved = 0x0
xmlerrcxt = 0x55afc89e59b0
ctxt = <optimized out>
input = 0x55afc89ef4b0
node = <optimized out>
name = <optimized out>
domain = <optimized out>
level = <optimized out>
errorBuf = 0x55afc89e5cc0
__func__ = "xml_errorHandler"
#3 0x00007fa1e00a5fa4 in __xmlRaiseError (schannel=0x55afc6542920
<xml_errorHandler>, schannel@entry=0x0, channel=channel@entry=0x0,
data=0x55afc89e59b0, data@entry=0x0, ctx=ctx@entry=0x55afc89ede80,
nod=nod@entry=0x0, domain=domain@entry=1, code=1, level=XML_ERR_FATAL,
file=0x0, line=838090, str1=0x7fa1e01cc24d "Huge input lookup", str2=0x0,
str3=0x0, int1=0, col=4,
msg=0x7ffd2a4485e0 "internal error: %s\n") at ../../error.c:604
ctxt = <optimized out>
node = 0x0
str = 0x55b08f70e270 "internal error: Huge input lookup\n"
input = <optimized out>
to = 0x55afc89ee0d8
baseptr = 0x0
#4 0x00007fa1e00aa900 in xmlFatalErr (ctxt=ctxt@entry=0x55afc89ede80,
error=error@entry=XML_ERR_INTERNAL_ERROR, info=info@entry=0x7fa1e01cc24d
"Huge input lookup") at ../../parser.c:546
errmsg = <optimized out>
errstr = "internal error: %s\n", '\000' <repeats 109 times>
#5 0x00007fa1e00acf14 in xmlGROW (ctxt=0x55afc89ede80) at
../../parser.c:2084
curEnd = <optimized out>
curBase = <optimized out>
#6 0x00007fa1e00c1338 in xmlParseContent__internal_alias
(ctxt=0x55afc89ede80) at ../../parser.c:10101
test = <optimized out>
cons = 0
cur = <optimized out>
#7 0x00007fa1e00c1c13 in xmlParseElement__internal_alias
(ctxt=ctxt@entry=0x55afc89ede80)
at ../../parser.c:10255
name = 0x55afc89ef577 "posts"
prefix = 0x0
URI = 0x0
node_info = {node = 0x0, begin_pos = 140333225866765, begin_line =
94213473493376, end_pos = 140333225866817, end_line = 94213473493376}
line = 2
tlen = 5
ret = 0x55afc89efab0
nsNr = 0
#8 0x00007fa1e00c266a in xmlParseDocument__internal_alias
(ctxt=ctxt@entry=0x55afc89ede80)
at ../../parser.c:10952
start = "<?xm"
enc = <optimized out>
#9 0x00007fa1e00c9fd9 in xmlDoRead (reuse=1, options=0, encoding=0x0,
URL=0x0, ctxt=0x55afc89ede80) at ../../parser.c:15430
ret = <optimized out>
#10 xmlCtxtReadMemory__internal_alias (ctxt=0x55afc89ede80,
buffer=buffer@entry=0x7fa09306f040 "<?xml version=\"1.0\"
encoding=\"utf-8\"?>\n<posts>\n <row Id=\"1\" PostTypeId=\"1\"
AcceptedAnswerId=\"727273\" CreationDate=\"2009-07-15T06:27:46.723\"
Score=\"155\" ViewCount=\"92736\" Body=\"&lt;p&gt;A Vista virtua"...,
size=size@entry=979473840, URL=URL@entry=0x0, encoding=encoding@entry=0x0,
options=options@entry=0) at ../../parser.c:15719
input = 0x55afc89ef410
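
For what it's worth, the numbers in the trace can be cross-checked with a rough sketch. Frame #10 shows size=979473840 with options=0, and frame #5 (xmlGROW) is where "Huge input lookup" is raised. The snippet below is only a simplified model of that kind of size check, not libxml2's actual code: the function name is hypothetical, and the two constants are assumptions taken from libxml2's headers (XML_MAX_LOOKUP_LIMIT and XML_PARSE_HUGE), not values that appear in this backtrace.

```python
# Simplified, hypothetical model of the size check suggested by frame #5 (xmlGROW).
# Both constants are assumptions based on libxml2's headers, not on this trace.
XML_MAX_LOOKUP_LIMIT = 10_000_000  # assumed lookup limit, in bytes
XML_PARSE_HUGE = 1 << 19           # assumed parser option that relaxes the limit


def exceeds_lookup_limit(buffered_bytes: int, options: int) -> bool:
    """Return True if this much buffered input would trip the limit
    when XML_PARSE_HUGE is not set in the parser options."""
    too_big = buffered_bytes > XML_MAX_LOOKUP_LIMIT
    huge_allowed = bool(options & XML_PARSE_HUGE)
    return too_big and not huge_allowed


size = 979_473_840  # size= argument from frame #10

# Size in MiB, for comparison with the "935M" reported for Posts.xml.
print(f"{size / 1024**2:.1f} MiB")

# options=0 as in frame #10, then the same input with the huge flag set.
print(exceeds_lookup_limit(size, options=0))
print(exceeds_lookup_limit(size, options=XML_PARSE_HUGE))
```

If those assumptions hold, the "Huge input lookup" error itself is expected for a file this large parsed with options=0. The segmentation fault comes later, in frame #0, while the error context is being printed.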
