Re: Question about SQL performance

From: PFC <lists(at)peufeu(dot)com>
To: "Jason Lustig" <lustig(at)brandeis(dot)edu>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Question about SQL performance
Date: 2007-06-05 05:52:12
Message-ID: op.ttfo9ajtcigqcu@apollo13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


> What sort of speed increase is there usually with binding parameters
> (and thus preparing statements) v. straight sql with interpolated
> variables? Will Postgresql realize that the following queries are
> effectively the same (and thus re-use the query plan) or will it think
> they are different?
>
> SELECT * FROM mytable WHERE item = 5;
> SELECT * FROM mytable WHERE item = 10;

No, if you send the above as text (not prepared) they are two different
queries.
Postgres' query executor is so fast that parsing and planning can take
longer than query execution sometimes. This is true of very simple selects
like above, or some very complex queries which take a long time to plan
but don't actually process a lot of rows.
I had this huge query (1 full page of SQL) with 5 joins, aggregates and
subqueries, returning about 30 rows ; it executed in about 5 ms, planning
and parsing time was significant...

> Obviously to me or you they could use the same plan. From what I
> understand (correct me if I'm wrong), if you use parameter binding -
> like "SELECT * FROM mytable WHERE item = ?" - Postgresql will know that
> the queries can re-use the query plan, but I don't know if the system
> will recognize this with above situation.

It depends if your client library is smart enough to prepare the
statements...

> Also, what's the difference between prepared statements (using PREPARE
> and EXECUTE) and regular functions (CREATE FUNCTION)? How do they impact
> performance? From what I understand there is no exact parallel to stored
> procedures (as in MS SQL or oracle, that are completely precompiled) in
> Postgresql. At the same time, the documentation (and other sites as
> well, probably because they don't know what they're talking about when
> it comes to databases) is vague because PL/pgSQL is often said to be
> able to write stored procedures but nowhere does it say that PL/pgSQL
> programs are precompiled.

PG stores the stored procedures as text. On first invocation, in each
connection, they are "compiled", ie. all statements in the SP are
prepared, so the first invocation in a connection is slower than next
invocations. This is a problem if you do not use persistent connections.

A simple select, when prepared, will take about 25 microseconds inside a
SP and 50-100 microseconds as a query over the network. If not prepared,
about 150 µs or 2-3x slower.

FYI Postgres beats MyISAM on "small simple selects" if you use prepared
queries.

I use the following Python code to auto-prepare my queries :

db = PGConn( a function that returns a DB connection )
db.prep_exec( "SELECT * FROM stuff WHERE id = %s", 1 ) # prepares and
executes
db.prep_exec( "SELECT * FROM stuff WHERE id = %s", 2 ) # executes only

class PGConn( object ):

def __init__( self, db_connector ):
self.db_connector = db_connector
self.reconnect()

def reconnect( self ):
self.prep_cache = {}
self.db = self.db_connector()
self.db.set_isolation_level( 0 ) # autocommit

def cursor( self ):
# return self.db.cursor( cursor_factory=psycopg2.extras.DictCursor )
return self.db.cursor( )

def execute( self, sql, *args ):
cursor = self.cursor()
try:
cursor.execute( sql, args )
except:
cursor.execute( "ROLLBACK" )
raise
return cursor

def executemany( self, sql, *args ):
cursor = self.cursor()
try:
cursor.executemany( sql, args )
except:
cursor.execute( "ROLLBACK" )
raise
return cursor

def prep_exec( self, sql, *args ):
cursor = self.cursor()
stmt = self.prep_cache.get( sql )
if stmt is None:
name = "stmt_%s" % (len( self.prep_cache ) + 1)
if args:
prep = sql % tuple( "$%d"%(x+1) for x in xrange( len( args )) )
else:
prep = sql
prep = "PREPARE %s AS %s" % (name, prep)
cursor.execute( prep )
if args:
stmt = "EXECUTE %s( %s )" % (name, ", ".join( ["%s"] * len( args ) ))
else:
stmt = "EXECUTE %s" % (name,)
self.prep_cache[ sql ] = stmt

try:
cursor.execute( stmt, args )
except Exception, e:
traceback.print_exc()
print "Error while executing prepared SQL statement :", stmt
print "Arguments :", args
print "Original SQL is :", sql
cursor.execute( "ROLLBACK" )
raise

return cursor

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2007-06-05 08:50:56 Re: Question about SQL performance
Previous Message mark 2007-06-05 05:23:12 Re: Question about SQL performance