Re: speed concerns with executemany()

From: Christophe Pettus <xof(at)thebuild(dot)com>
To: mike bayer <mike_mp(at)zzzcomputing(dot)com>
Cc: "psycopg(at)postgresql(dot)org" <psycopg(at)postgresql(dot)org>
Subject: Re: speed concerns with executemany()
Date: 2016-12-24 00:09:36
Message-ID: 79BCF488-D1FB-42E1-B0E9-D4A54E7341A4@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

Are you running with the transaction isolation level set to ISOLATION_LEVEL_AUTOCOMMIT? If so, each of those INSERTs will be in its own transaction, and thus will go through the COMMIT overhead. That by itself wouldn't explain a jump that large (in most environments), but it will definitely be *much* slower.

> On Dec 23, 2016, at 16:05, mike bayer <mike_mp(at)zzzcomputing(dot)com> wrote:
>
> I'm getting more and more regular complaints among users of SQLAlchemy of the relatively slow speed of the cursor.executemany() call in psycopg2. In almost all cases, these users have discovered that Postgresql is entirely capable of running an INSERT or UPDATE of many values with a high degree of speed using a single statement with a form like this:
>
> INSERT INTO table (a, b, c)
> VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9), ...
>
> whereas if they run the same form using a single VALUES insert and sending the parameters as a sequence via executemany(), they will observe hyperbolically slow speeds - today, someone claims they can run approximately 200 sets of three integers each using the multiple VALUES approach in approximately .02 seconds, whereas running 200 values into a single executemany() call of the otherwise identical INSERT statement, they are claiming takes 20 seconds; that is, 100000% slower. I'm not really sure how that's even possible, considering the single INSERT with many VALUES is a much larger string to send over the network and be parsed by the server, if the overhead of a single INSERT is .02 seconds, we would think an executemany() of 200 INSERT statements each with a single parameter set would be at most four seconds.
>
> Obviously something is wrong with these users' environment, although I will note that the relative speed of psycopg2 executemany() over a 1G network is still pretty bad, compared to both sending a single INSERT with a large VALUES clause as well as compared to the executemany() speed of DBAPIs (even pure Python) for other databases like MySQL, I can provide some quick benchmarks if that's helpful.
>
> I understand that psycopg2 does not use prepared statements, and I have dim recollections that internal use of prepared statements for executemany is not on the roadmap for psycopg2. However, I'm still not sure what I should be telling my users when I get reports of these vastly slower results with executemany().
>
> I'm not asking that psycopg2 change anything, I'm just looking to understand what the heck is going on when people are reporting this. Should I:
>
> 1. tell them they have a network issue that is causing executemany() to have a problem? (even though I can also observe executemany() is kind of slow, though not as slow as these people are reporting)
>
> 2. tell them there's some known issue, vacuuming / indexes/ or something that is known to have this effect?
>
> 3. tell them that yes, they should use multiple-VALUES within a single INSERT (which would eventually lead to strong pressure on me to reinvent executemany() within SQLAlchemy's psycopg2 dialect to use this form) ? (also if so, why is this the case? can this claimed 100000% slowdown be real?)
>
> 4. use a hack to actually make my own prepared statements within executemany() (I vaguely recall some recipe that you can get a prepared statement going with psycopg2 by rolling it on the outside) ?
>
> 5. Other reasons that executemany() is known to sometimes be extremely slow?
>
> I'm purposely trying to stay out of the realm of picking apart the libpq internals, assuming psycopg2 devs can shed some light what's going on here. Thanks for any guidance you can offer!
>
>
>
>
>
> --
> Sent via psycopg mailing list (psycopg(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/psycopg

--
-- Christophe Pettus
xof(at)thebuild(dot)com

In response to

Responses

Browse psycopg by date

  From Date Subject
Next Message Daniele Varrazzo 2016-12-24 00:29:14 Re: speed concerns with executemany()
Previous Message mike bayer 2016-12-24 00:05:51 speed concerns with executemany()