Re: Insertion of large xml files into PostgreSQL 10beta1

From: Alain Toussaint <atoussaint1976(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Insertion of large xml files into PostgreSQL 10beta1
Date: 2017-06-26 03:02:41
Message-ID: CAGo4VQ+bSGMk37zDgpvRkmTueJayuxv4x7gjKeZTssGz01hDgA@mail.gmail.com
Lists: pgsql-general

> Narrowing down the entire file to a small problem region and posting a
> self-contained example,

The URL below contains the XML record for a publication I
worked on many years ago:

https://www.ncbi.nlm.nih.gov/pubmed/21833294?report=xml&format=text

The particularly problematic region of the XML content is this:

<CommentsCorrectionsList>
<CommentsCorrections RefType="Cites">
<RefSource>Neuroreport. 2000 Sep 11;11(13):2969-72</RefSource>
<PMID Version="1">11006976</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>J Neurosci. 2005 May 25;25(21):5148-58</RefSource>
<PMID Version="1">15917455</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Neuroimage. 2003 Dec;20(4):1944-54</RefSource>
<PMID Version="1">14683700</PMID>
</CommentsCorrections>

There are more comments of this type in a given citation.

> or at least providing the error messages and
> content, might help elicit good responses.

Here it is:

ERROR: syntax error at or near "44"
LINE 1: 44(1):37-43</RefSources>

The command I used is this one:

echo "INSERT INTO samples (xmldata) VALUES $(cat /srv/pgsql/pubmed/medline17n0001.xml)" | /usr/bin/psql medline 1>/dev/null 2>error.log

wc -l error.log
11145 error.log
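
A likely reason for the sheer number of errors, offered as a guess rather than a confirmed diagnosis: the command pastes the XML into the statement without parentheses or quoting, so psql treats it as SQL and splits it at every semicolon (for example in "2000 Sep 11;11(13):2969-72"). A minimal sketch of one way around that is to wrap the file content in a dollar-quoted string literal. The function name make_insert and the delimiter $pubmed$ are hypothetical choices, and the sketch assumes the file never contains the text $pubmed$; the table and column names are the ones from the command above.

```shell
#!/bin/sh
# Emit an INSERT statement whose single value is the content of file "$1",
# wrapped in a dollar-quoted literal so that quotes, semicolons and digits
# inside the XML are not parsed as SQL.
# Assumption: the file does not contain the delimiter $pubmed$.
make_insert() {
    printf 'INSERT INTO samples (xmldata) VALUES ($pubmed$'
    cat "$1"
    printf '$pubmed$);\n'
}
```

It could then be used as "make_insert /srv/pgsql/pubmed/medline17n0001.xml | /usr/bin/psql medline". This only addresses the quoting; whether the server accepts a document of that size as a single xml value is a separate question.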

The same error message is repeated a great many times, but I didn't
check the entire log for other kinds of error messages.
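
One way to check the log for other kinds of errors without reading all of it is to count the distinct ERROR lines. This is a small sketch; the function name summarize_errors is hypothetical, and it assumes each error in error.log starts a line with "ERROR:", as in the message shown above.

```shell
#!/bin/sh
# Print each distinct ERROR line of the log file "$1" with its number of
# occurrences, most frequent first. LINE/context lines are ignored.
summarize_errors() {
    grep '^ERROR:' "$1" | sort | uniq -c | sort -rn
}
```

Running "summarize_errors error.log" would then show at a glance whether the syntax error near "44" is the only kind of message.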

> Even if you could load the data
> without incident using it make end up proving problematic.

Agreed, the box will definitely need more RAM, and I could be better
off with a more recent graphics card (NVIDIA or AMD, whichever is
supported by TensorFlow 1.2 and up). I'll figure it out as I go.

Many thanks.

Alain
