Re: Possible reasons for copy_from to never terminate?

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Daniel(dot)Straight(at)Intralox(dot)com, psycopg(at)postgresql(dot)org
Subject: Re: Possible reasons for copy_from to never terminate?
Date: 2015-04-24 15:32:16
Message-ID: 553A6200.7060107@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

On 04/24/2015 07:57 AM, Daniel(dot)Straight(at)Intralox(dot)com wrote:
>> So how long does it 'usually' run?
>
> Usual is best measured in ms. The amount of data is on the order of 100-1000 lines of 10 or so columns.
>
>> Define client.
>> Are you talking about the Web client making the POST or the Flask app as client to Postgres?
>
> I'm talking about the web client making the POST request.
>
>> This would Postgres complaining about losing connection to presumably the code you show
>> in the Flask app. Though it would be nice to know where exactly the copy_from is in the Flask flow of things.
>
> Right, it's losing connection when I restart, but my concern is how long it was connected in the first place.
>
>> I would say a more complete code snippet would help. Something
>> encompassing how the POST data gets into the list_of_strings and then
>> into the copy_from()
>
> Sure thing. I have a Flask route which accepts POST requests with a UTF-8 plain text body.
> The body is tab-separated data, timestamps and numbers. I read the data line by line using
> a map over flask.request.stream, split on tabs, reformat timestamps with isoformat so
> Postgres will take them, re-join with tabs and then send it all to copy_from. Here's the code.
> convert_timestamps just takes known timestamp columns and replaces them with equivalent
> isoformat strings. get_db returns an open psycopg2 connection.
> All of this works without issue >99% of the time.
>
> def handle_post():
> lines = list(map(
> lambda line_data: '\t'.join(convert_timestamps(line_data)),
> map(
> lambda line: line.decode('utf-8').strip().split('\t'),
> flask.request.stream)))
> db = get_db()
> cursor = db.cursor()
> cursor.copy_from(io.StringIO('\n'.join(lines)), 'tablename')
> db.commit()
> cursor.close()
> return len(lines)

Not seeing anywhere in above where connection is closed. I do not know
Flask well enough to know if that is handled for you or not?

>
> So copy_from should be getting data which is entirely in memory. I don't see how it
> could never finish reading the data, which is what seems to be happening.

Maybe it is, but is not closing correctly. When I do this sort of thing
I explicitly close() the memory file.

>
> The incoming data is being sent over a slow connection, so sometimes it takes a while
> to send, making the total server response time over a second, but it seems like by the
> time copy_from is hit, it should definitely have all the data, so the slow connection
> doesn't seem like it should be an issue here.
>
> -Daniel
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Browse psycopg by date

  From Date Subject
Next Message Ludovic Gasc 2015-05-06 20:56:48 Sanitize schema name
Previous Message Daniel.Straight 2015-04-24 14:57:52 Re: Possible reasons for copy_from to never terminate?