Re: speed concerns with executemany()

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>, Dorian Hoxha <dorian(dot)hoxha(at)gmail(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, mike bayer <mike_mp(at)zzzcomputing(dot)com>, "psycopg(at)postgresql(dot)org" <psycopg(at)postgresql(dot)org>
Subject: Re: speed concerns with executemany()
Date: 2016-12-30 23:55:32
Message-ID: c5457fe5-cefb-7c7b-9a33-6a7ff0b3658f@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

On 12/30/2016 02:24 PM, Daniele Varrazzo wrote:
> The implementation of executemany as described by me a few days ago is
> available in this gist, not heavily tested:
>
> https://gist.github.com/dvarrazzo/4204cca5d1cd7c9c95bc814d81e6b83e
>
> I would like to know if anyone sees any shortcoming in this new implementation.

A quick test. I added an argument to change the page_size on the command
line:

With NRECS=10000:

aklaver(at)tito:~> python psycopg_executemany.py -p 10
classic: 0.800544023514 sec
joined: 0.514330863953 sec
aklaver(at)tito:~> python psycopg_executemany.py -p 100
classic: 0.780461072922 sec
joined: 0.473304986954 sec
aklaver(at)tito:~> python psycopg_executemany.py -p 1000
classic: 0.820818901062 sec
joined: 0.488647937775 sec

With NRECS=100000:

aklaver(at)tito:~> python psycopg_executemany.py -p 10
classic: 7.78319811821 sec
joined: 4.18683385849 sec
aklaver(at)tito:~> python psycopg_executemany.py -p 100
classic: 7.75992202759 sec
joined: 4.06096816063 sec
aklaver(at)tito:~> python psycopg_executemany.py -p 1000
classic: 7.76269102097 sec
joined: 4.12301802635 sec

The relative difference between the classic and joined seems to hold,
you just do not seem to get much benefit from changing the page_size.
Not sure how much that matters and you do get a benefit from the joined
solution.

>
> -- Daniele
>
> On Sun, Dec 25, 2016 at 10:11 AM, Dorian Hoxha <dorian(dot)hoxha(at)gmail(dot)com> wrote:
>> Sending stuff in big-batches + autocommit (fast transactions) + few network
>> calls is performance 101 I thought. I think the "executemany" should be
>> documented what it does (it looked suspicious when I saw it long time ago,
>> why I didn't use it).
>>
>> On Sat, Dec 24, 2016 at 6:00 AM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
>> wrote:
>>>
>>> On 12/23/2016 06:57 PM, Christophe Pettus wrote:
>>>>
>>>>
>>>>> On Dec 23, 2016, at 18:55, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
>>>>> wrote:
>>>>> Alright that I get. Still the practical outcome is each INSERT is being
>>>>> done in a transaction (an implicit one) so the transaction overhead comes
>>>>> into play. Or am I missing something?
>>>>
>>>>
>>>> Nope, not missing a thing. The theory (and it is only that) is that when
>>>> they do the .executemany(), each of those INSERTs pays the transaction
>>>> overhead, while if they do one big INSERT, just that one statement does.
>>>
>>>
>>> Just ran a quick and dirty test using IPython %timeit.
>>>
>>> With a list of 200 tuples each which had 3 integers INSERTing into:
>>> test=> \d psycopg_table
>>> Table "public.psycopg_table"
>>> Column | Type | Modifiers
>>> --------+---------+-----------
>>> a | integer |
>>> b | integer |
>>> c | integer |
>>>
>>>
>>> The results where:
>>>
>>> sql = "INSERT INTO psycopg_table VALUES(%s, %s, %s)"
>>>
>>> Without autocommit:
>>>
>>> In [65]: timeit -n 10 cur.executemany(sql, l)
>>> 10 loops, best of 3: 12.5 ms per loop
>>>
>>>
>>> With autocommit:
>>>
>>> In [72]: timeit -n 10 cur.executemany(sql, l)
>>> 10 loops, best of 3: 1.71 s per loop
>>>
>>>
>>>>
>>>> --
>>>> -- Christophe Pettus
>>>> xof(at)thebuild(dot)com
>>>>
>>>
>>>
>>> --
>>> Adrian Klaver
>>> adrian(dot)klaver(at)aklaver(dot)com
>>>
>>>
>>>
>>> --
>>> Sent via psycopg mailing list (psycopg(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/psycopg
>>
>>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse psycopg by date

  From Date Subject
Next Message Dorian Hoxha 2016-12-31 07:42:31 Re: speed concerns with executemany()
Previous Message Christophe Pettus 2016-12-30 23:42:08 Re: speed concerns with executemany()