Re: Escaping input from COPY

From: Roger Leigh <rleigh(at)codelibre(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Escaping input from COPY
Date: 2011-12-21 23:26:46
Message-ID: 20111221232646.GL5437@codelibre.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Dec 21, 2011 at 06:16:42PM -0500, Josh Kupershmidt wrote:
> On Tue, Dec 20, 2011 at 7:47 PM, Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com> wrote:
> > As far as I know you did not get an answer, which is not the same as there being
> > no answer:) I think you will find that the escaping is handled for you.
>
> I am rather dubious of the claim that "escaping is handled for you"
> with copy_from().
[...]
> This works because the strings have essentially been escaped by hand,
> and None turned into '\N'. So let's say you had the same data, without
> the escaping being done by hand, like this:
[...]
> But only because none of the rows happen to contain any characters
> which must be be escaped. How are you supposed to use copy_from() with
> arbitrary text, e.g.
>
> rows = [('Strange\t\tFirst\\Name', 'Last\nName', 100),
> ]
>
> because that sure doesn't seem to be handled automagically. Yes, I
> know I can write my own escaping code, but as Roger points out that's
> not ideal.

Yes, this is exactly the issue I have. Without being able to
handle the escaping of arbitrary data, it's too fragile to rely on.
At some point I'll get caught out by data containing escapes or
tabs, and it will all go horribly wrong.

I did see an example of using CSV format instead here:
http://www.perlmonks.org/?node_id=847265
and I'm sure there's a Pythonic equivalant which you could also use.
Since CSV has somewhat better-defined quoting rules (i.e. not
PostgreSQL-specific), I think I'll be trying this first. I'd be
happy to use the PostgreSQL format, but to avoid future breakage, I'd
need some way of determining what the server-side format is when
escaping.

If neither are sufficiently robust, I'll need to stick with
parameterised inserts, which handle arbitrary stuff without problems,
other than being slow.

Regards,
Roger

--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2011-12-21 23:30:33 Re: Why vacumming performed on template1 with initdb command ?
Previous Message Misa Simic 2011-12-21 23:20:56 Re: [partition table] python fetchall or fetchone function can not get the returning rows