From: | Thomas Kellerer <shammat(at)gmx(dot)net> |
---|---|
To: | Wim Bertels <wim(dot)bertels(at)ucll(dot)be>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: nested xml/json to table |
Date: | 2023-03-17 10:21:23 |
Message-ID: | 90741398-62ca-3bd5-79d0-c0f6a2eef4cf@gmx.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Wim Bertels schrieb am 17.03.2023 um 11:05:
> what would be the general idea: "easily" convert an hierarchical
> structure like json or xml to a table; for example creating columns by
> appending the key-names when going doing down the three, using null for
> empty values, adding more columns as needed by the given structure.
> (1-way operation)
>
> a few conceptual gists:
> jsonX=
> {
> "glossary": {
> "title": "example glossary",
> "GlossDiv": {
> "title": "S",
> "GlossList": {
> "GlossEntry": {
> "ID": "SGML",
> "SortAs": "SGML",
> "GlossTerm": "Sta..";
> "Acronym": "SGML",
> "Abbrev": "ISO 8879:1986",
> "GlossDef": {
> "para": "A meta-m..",
> "GlossSeeAlso":
> ["GML", "XML"]
> },
> "GlossSee": "markup"
> }
> }
> }
> }
> }
>
> select *
> from json_to_table('jsonX');
> -- generated columns with no data/only nulls could be removed..
> -- arrays could be unnested in the process as well
>
> glossary | glossary.title | glossary.title.GlossDiv.title | ..
> -------------------------------------------------------------
> null | example glossary | S |
> ..
>
> the last column: glossary.title.GlossDiv.GlossList.GlossEntry.GlossSee
> with value "markup"
>
> ---
>
> what if there are different structures that need to be combined?
> (they could be added in the same manner as before)
>
> jsonY=
> {
> s1:[{
> "f1": "a",
> "f2": "b",
> "f3": { "f3.1": "c",
> "f3.2": "d"}
> },
> {
> "f1": "e",
> "f4": "g"
> }
> ]
> }
>
> select *
> from json_to_table('jsonY');
> -- generated columns with no data/only nulls could be removed..
> -- separator sign is untrusted
>
> s1 | s1.f1 | s1.f2 | s1.f3 | s1.f3.f3.1 | s1.f3.f3.2 | s1.f4
> -------------------------------------------------------------
> null| a | b | null | c | d | null
> null| e | null | null | null | null | g
You can't have a function that returns a different set of columns each time you call it
(without specifying the output columns - which you don't want).
I have once written a function to flatten a JSON hierarchy to multiple rows.
Applied to your first example it would return the following:
path | key | value
-------------------------------------------------+--------------+-----------------
/glossary | title | example glossary
/glossary/GlossDiv | title | S
/glossary/GlossDiv/GlossList/GlossEntry | ID | SGML
/glossary/GlossDiv/GlossList/GlossEntry | Abbrev | ISO 8879:1986
/glossary/GlossDiv/GlossList/GlossEntry | SortAs | SGML
/glossary/GlossDiv/GlossList/GlossEntry | Acronym | SGML
/glossary/GlossDiv/GlossList/GlossEntry | GlossSee | markup
/glossary/GlossDiv/GlossList/GlossEntry | GlossTerm | Sta..
/glossary/GlossDiv/GlossList/GlossEntry/GlossDef | para | A meta-m..
/glossary/GlossDiv/GlossList/GlossEntry/GlossDef | GlossSeeAlso | ["GML", "XML"]
And the following for the second example:
path | key | value
-------+------+------
/s1 | f1 | a
/s1 | f2 | b
/s1/f3 | f3.1 | c
/s1/f3 | f3.2 | d
/s1 | f1 | e
/s1 | f4 | g
Thomas
Attachment | Content-Type | Size |
---|---|---|
flatten_json.sql | text/plain | 1.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2023-03-17 10:26:23 | Re: src/test/examples/testlibpq2.c where the HAVE_SYS_SELECT_H is defined. |
Previous Message | Wim Bertels | 2023-03-17 10:05:18 | nested xml/json to table |