From: Alastair McKinley <a(dot)mckinley(at)analyticsengines(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: copying json data and backslashes
Date: 2022-11-22 14:23:16
Message-ID: PAXPR02MB7600EDF4AA4898094EBD8B10E30D9@PAXPR02MB7600.eurprd02.prod.outlook.com
Lists: pgsql-general
Hi all,
I have come across this apparently common issue when COPY-ing JSON, and I am wondering if there is a better solution.
I am copying data originating from a 3rd-party API into a jsonb column. The data may contain literal \r, \t, and \n sequences, as well as doubled backslashes.
I discovered that I can cast this data directly to a jsonb value, but I can't COPY it without pre-processing.
The example below illustrates the issue (only with \r, but the problem extends to the other \X combinations).
do $$
lines = [r'{"test" : "\r this data has a carriage return"}']
with open("/tmp/test1.json", "w") as f:
    for line in lines:
        f.write(line.strip() + "\n")
$$ language plpython3u;
create temp table testing (data jsonb);
-- this works
insert into testing (data)
select l::jsonb
from pg_read_file('/tmp/test1.json') f,
lateral regexp_split_to_table(f,'\n') l where l <> '';
-- fails
copy testing (data) from '/tmp/test1.json';
-- works (note: literal \r must map to \u000d, the carriage return code point)
copy testing (data) from program $c$ sed -e 's/\\r/\\\\u000d/g' /tmp/test1.json $c$;
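A more general pre-processing step than per-escape substitution (a sketch only, with `escape_for_copy` as a hypothetical helper name) is to double every backslash before COPY, since COPY's text format consumes exactly one level of backslash escaping before the value reaches the jsonb input function:

```python
def escape_for_copy(line: str) -> str:
    # Double every backslash so that COPY's text-format un-escaping
    # restores the original JSON escapes (\r, \t, \n, \\, \uXXXX, ...).
    return line.replace("\\", "\\\\")

raw = r'{"test" : "\r this data has a carriage return"}'
print(escape_for_copy(raw))
```

This sidesteps enumerating the individual \X combinations, at the cost of an extra pass over the input.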
Is there any other solution with COPY that doesn't require manually implementing search-and-replace for these edge cases?
Why does the ::jsonb cast work when COPY doesn't? It seems a bit inconsistent.
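My understanding of the difference (an assumption on my part): the ::jsonb cast hands the raw two characters `\` `r` straight to the JSON parser, where they form a valid escape, while COPY's text format first un-escapes `\r` into an actual carriage return (U+000D), which JSON forbids unescaped inside a string. A small Python sketch, with the stdlib `json` module standing in for the jsonb parser, illustrates this:

```python
import json

raw = r'{"test" : "\r data"}'  # contains backslash + 'r' (two characters)

# Cast-style path: the parser sees the raw characters -> valid JSON escape.
value = json.loads(raw)

# COPY-style path: \r is un-escaped to a real carriage return first,
# so the parser then sees an unescaped control character and rejects it.
after_copy = raw.replace("\\r", "\r")
try:
    json.loads(after_copy)
except json.JSONDecodeError:
    print("unescaped control character -> parse error")
```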
Best regards,
Alastair