Re: speed concerns with executemany()

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>, Federico Di Gregorio <fog(at)dndg(dot)it>
Cc: "psycopg(at)postgresql(dot)org" <psycopg(at)postgresql(dot)org>
Subject: Re: speed concerns with executemany()
Date: 2017-01-05 21:23:09
Message-ID: b08cf4b4-f588-e36e-0cd7-2543abc57809@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

On 01/05/2017 11:00 AM, Daniele Varrazzo wrote:
> On Thu, Jan 5, 2017 at 5:32 PM, Federico Di Gregorio <fog(at)dndg(dot)it> wrote:
>> On 02/01/17 17:07, Daniele Varrazzo wrote:
>>>
>>> On Mon, Jan 2, 2017 at 4:35 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
>>> wrote:
>>>>
>>>> With NRECS=10000 and page size=100:
>>>>
>>>> aklaver(at)tito:~> python psycopg_executemany.py -p 100
>>>> classic: 427.618795156 sec
>>>> joined: 7.55754685402 sec
>>>
>>> Ugh! :D
>>
>>
>> That's great. Just a minor point: I won't overload executemany() with this
>> feature but add a new method UNLESS the semantics are exactly the same
>> especially regarding session isolation. Also, right now psycopg keeps track
>> of the number of affected rows over executemany() calls: I'd like to not
>> lose that because it is a breaking change to the API.
>
> It seems to me that the semantics would stay the same, even in
> presence of volatile functions. However unfortunately rowcount would
> break. That's just sad.
>
> We can have no problem an extra argument to executemany: page_size
> defaulting to 1 (previous behaviour) which could be bumped. It's sad
> the default cannot be 100.
>
> Mike Bayer reported (https://github.com/psycopg/psycopg2/issues/491)
> that SQLAlchemy actually uses the aggregated rowcount for concurrency
> control.
>
> So, how much it is of a deal-breaker? Can we afford losing aggregated
> rowcount to obtain a juicy speedup in default usage, or we'd rather
> leave the behaviour untouched but having people "opting in for speed"?
>
> ponder, ponder...
>
> Pondered: as the features had little test and I don't want to delay
> releasing 2.7 further, I'd rather release the feature with a page_size
> default of 1. People could use it and report eventual failures if they
> use a page_size > 1. If tests turn out to be positive that the
> database behaves ok we could think about changing the default in the
> future. We may want to drop the aggregated rowcount in the future but
> with better planning, e.g. to allow SQLAlchemy to ignore aggregated
> rowcount from psycopg >= 2.8...
>
> How does it sound?

Works for me.
>
> -- Daniele
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Browse psycopg by date

  From Date Subject
Next Message Federico Di Gregorio 2017-01-05 22:12:26 Re: speed concerns with executemany()
Previous Message Adrian Klaver 2017-01-05 21:22:40 Re: Solving the SQL composition problem