nested xml/json to table

From: Wim Bertels <wim(dot)bertels(at)ucll(dot)be>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: nested xml/json to table
Date: 2023-03-17 10:05:18
Message-ID: 05b3f3beccba4ce16d2a4ad411085ae5c46e2e82.camel@ucll.be
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

in python pandas there is for example a json_normalize function,
i didn't find something similar or better in postgresql?

what would be the general idea: "easily" convert an hierarchical
structure like json or xml to a table; for example creating columns by
appending the key-names when going doing down the three, using null for
empty values, adding more columns as needed by the given structure.
(1-way operation)

a few conceptual gists:
jsonX=
{
"glossary": {
"title": "example glossary",
"GlossDiv": {
"title": "S",
"GlossList": {
"GlossEntry": {
"ID": "SGML",
"SortAs": "SGML",
"GlossTerm": "Sta..";
"Acronym": "SGML",
"Abbrev": "ISO 8879:1986",
"GlossDef": {
"para": "A meta-m..",
"GlossSeeAlso": 
["GML", "XML"]
},
"GlossSee": "markup"
}
}
}
}
}

select *
from json_to_table('jsonX');
-- generated columns with no data/only nulls could be removed..
-- arrays could be unnested in the process as well

glossary | glossary.title | glossary.title.GlossDiv.title | ..
-------------------------------------------------------------
null | example glossary | S |
..

the last column: glossary.title.GlossDiv.GlossList.GlossEntry.GlossSee
with value "markup"

---

what if there are different structures that need to be combined?
(they could be added in the same manner as before)

jsonY=
{
s1:[{
"f1": "a",
"f2": "b",
"f3": { "f3.1": "c",
"f3.2": "d"}
},
{
"f1": "e",
"f4": "g"
}
]
}

select *
from json_to_table('jsonY');
-- generated columns with no data/only nulls could be removed..
-- separator sign is untrusted

s1 | s1.f1 | s1.f2 | s1.f3 | s1.f3.f3.1 | s1.f3.f3.2 | s1.f4
-------------------------------------------------------------
null| a | b | null | c | d | null
null| e | null | null | null | null | g

any ideas or suggestions (apart from plpython)?
Wim

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Thomas Kellerer 2023-03-17 10:21:23 Re: nested xml/json to table
Previous Message Andreas Joseph Krogh 2023-03-17 07:56:51 Delete values from JSON