xml-file as foreign table?

From: Johann Spies <johann(dot)spies(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: xml-file as foreign table?
Date: 2016-04-28 12:32:18
Message-ID: CAGZ55DTorNLRvs0BnCV_WjRRWYw5RxUONkSF8r9DSqR=FZzAHA@mail.gmail.com
Lists: pgsql-general

I have several large (7GB+) xml files to get into an SQL database.

The xml-files can contain up to 500 000 subrecords which I want to be able
to query in the database.

They are too large to do something like this:

insert into rawxml (xml) select XMLPARSE (DOCUMENT
CONVERT_FROM(PG_READ_BINARY_FILE('FOO.xml' ), 'UTF8'));

If it were possible, each file would be one huge record in the table which
can then be unpacked using XPATH.

The options I am considering are:

1. Unpack the individual records (will be more than 50 million) using
something like python with lxml and psycopg2 and insert them after dropping
all indexes and triggers on the table

2. Unpack the individual records and write a (very) large tsv-file and then
insert it using 'copy'
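Option 2 can be sketched with the Python standard library alone: iterparse
streams the document, so a 7GB+ file never has to fit in memory, and the
output is a tsv stream that 'copy' can load directly. This is only a
minimal sketch; the element and field names ("record", "id", "value") are
placeholders and would need to match the real schema.

```python
# Stream-parse a large XML file and emit TSV rows suitable for COPY.
# Assumption: each subrecord looks like <record><id>..</id><value>..</value></record>.
import io
import xml.etree.ElementTree as ET

SAMPLE = b"""<export>
  <record><id>1</id><value>foo</value></record>
  <record><id>2</id><value>bar</value></record>
</export>"""

def xml_to_tsv(source, out):
    # iterparse yields each element when its closing tag is read, so we
    # can process one record at a time instead of building the whole tree.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "record":
            row = [elem.findtext("id", ""), elem.findtext("value", "")]
            out.write("\t".join(row) + "\n")
            # Free the finished subtree; for very large files one would
            # also prune the root element's children to keep memory flat.
            elem.clear()

out = io.StringIO()
xml_to_tsv(io.BytesIO(SAMPLE), out)
print(out.getvalue(), end="")
```

The resulting file could then be loaded with something like
COPY mytable FROM '/path/to/out.tsv' (a guess at the target table name),
or streamed straight into psycopg2's copy_expert without an intermediate
file.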

It would be convenient if I could use the existing xml files as 'foreign
tables' and parse them using the xpath capabilities of PostgreSQL.

Is this possible?

If not, what would be the most efficient way of getting the individual
xml records into an xml-type field in a table?

The documentation did not give me a clear answer.

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
