Re: Performance Bottleneck

From: Martin Foster <martin(at)ethereal-realms(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance Bottleneck
Date: 2004-08-06 23:18:49
Message-ID: tlUQc.32079$cd2.29211@clgrps12
Lists: pgsql-performance

Mike Benoit wrote:

> On Wed, 2004-08-04 at 17:25 +0200, Gaetano Mendola wrote:
>
>
>>>The queries themselves are simple, normally drawing information from one
>>>table with few conditions or, in the most complex cases, using joins on
>>>two tables or subqueries. These behave very well and always have; the
>>>problem is that these queries occur in rather large numbers due to
>>>the dumb nature of the scripts themselves.
>>
>>Show us the EXPLAIN ANALYZE for those queries and how many rows the tables
>>contain; the table schema could also be useful.
>>
>
>
> If the queries themselves are optimized as much as they can be, and as
> you say, it's just the sheer number of similar queries hitting the
> database, you could try using prepared queries for the ones that are most
> often executed to eliminate some of the overhead.
>
> I've had relatively good success with this in the past, and it doesn't
> take very much code modification.
>

One of the biggest problems is most probably related to the indexes.
Since logging the information needed to see which queries are used and
which are not carries a performance penalty of its own, I cannot really
make use of it for now.

However, I am curious how one would go about preparing a query. Is this
similar to DBI's prepare statement with placeholders, simply changing the
values passed on execute? Or is it something at the database level, such
as a view, et cetera?

SELECT
    Post.PostIDNumber,
    Post.$format,
    Post.PuppeteerLogin,
    Post.PuppetName,
    Post.PostCmd,
    Post.PostClass
FROM Post
WHERE Post.PostIDNumber > ?::INT
  AND (Post.PostTo = 'all' OR Post.PostTo = ?)
  AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin
                   FROM PuppetIgnore
                   WHERE PuppetIgnore.PuppetIgnore = 'global'
                     AND PuppetIgnore.PuppeteerLogin = ?
                     AND PuppetIgnore.PuppetLogin = Post.PuppeteerLogin)
       OR Post.PuppeteerLogin IS NULL)
  AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName
                   FROM PuppetIgnore
                   WHERE PuppetIgnore.PuppetIgnore = 'single'
                     AND PuppetIgnore.PuppeteerLogin = ?
                     AND PuppetIgnore.PuppetName = Post.PuppetName)
       OR Post.PuppetName IS NULL)
ORDER BY Post.PostIDNumber LIMIT 100
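
If it is the database-level form, my rough and untested guess is that a
server-side prepared version of the above would look something like the
following (the statement name fetch_posts is made up, and PREPARE/EXECUTE
only exist in recent PostgreSQL releases, 7.3 and up as far as I know):

PREPARE fetch_posts (INTEGER, TEXT, TEXT) AS
  SELECT
      Post.PostIDNumber,
      -- Post.$format omitted: identifiers cannot be parameters, so the
      -- script would still interpolate that column name before the PREPARE
      Post.PuppeteerLogin,
      Post.PuppetName,
      Post.PostCmd,
      Post.PostClass
  FROM Post
  WHERE Post.PostIDNumber > $1
    AND (Post.PostTo = 'all' OR Post.PostTo = $2)
    AND (NOT EXISTS (SELECT 1 FROM PuppetIgnore
                     WHERE PuppetIgnore.PuppetIgnore = 'global'
                       AND PuppetIgnore.PuppeteerLogin = $3
                       AND PuppetIgnore.PuppetLogin = Post.PuppeteerLogin)
         OR Post.PuppeteerLogin IS NULL)
    AND (NOT EXISTS (SELECT 1 FROM PuppetIgnore
                     WHERE PuppetIgnore.PuppetIgnore = 'single'
                       AND PuppetIgnore.PuppeteerLogin = $3
                       AND PuppetIgnore.PuppetName = Post.PuppetName)
         OR Post.PuppetName IS NULL)
  ORDER BY Post.PostIDNumber LIMIT 100;

If I understand it correctly, the plan is built once at PREPARE time and
reused by every EXECUTE, which is where the per-query overhead savings
would come from; the trade-off is that the planner no longer sees the
actual parameter values.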

The range is determined from the previous run or through the query listed
below. It was determined that using an INT was far faster than limiting
by timestamp.

SELECT MIN(PostIDNumber)
FROM Post
WHERE RealmName = ?
  AND PostClass IN ('general', 'play')
  AND PostTo = 'all'

The above simply provides a starting point, nothing more. Once posts
are pulled, the script passes in the last pulled number so as to start
from a fresh point.
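
Assuming the fetch_posts sketch above, each pass would then just be an
EXECUTE with the current values (the values and names below are made up,
and the statement would only live for the lifetime of the connection, so
this presumably pays off mainly with persistent connections):

-- first pass: seed $1 with the MIN(PostIDNumber) result
EXECUTE fetch_posts(41237, 'SomePuppet', 'SomePuppeteer');
-- later passes: seed $1 with the highest PostIDNumber pulled last time
EXECUTE fetch_posts(41337, 'SomePuppet', 'SomePuppeteer');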

Under MySQL, time was stored as an INT, which may have helped it handle
timestamps more efficiently. The old code also made use of three or more
queries, two of which were run just to generate an IN list for the query
actually being executed at the time.

Martin Foster
Creator/Designer Ethereal Realms
martin(at)ethereal-realms(dot)org
