Re: Migrated from 8.3 to 9.0 - need to update config (re-post)

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-performance(at)postgresql(dot)org>,<stonec(dot)register(at)sympatico(dot)ca>
Subject: Re: Migrated from 8.3 to 9.0 - need to update config (re-post)
Date: 2011-09-13 21:13:00
Message-ID: 4E6F810C0200002500041144@gw.wicourts.gov
Lists: pgsql-performance

Carlo Stonebanks wrote:

>>> max_connections = 300
>> Too high. Both throughput and latency should improve with
>> correct use of a connection pooler.

> Even for 300 stateful applications that can remain connected for
> up to a week, continuously distilling data (imports)?

Absolutely.

A good connection pooler will be able to hold those 300 *client*
connections, and maintain a much smaller set of connections to the
database. It will notice when a client connection is requesting the
start of a database transaction. If there is an idle database
connection it will route the requests there; otherwise it will put
that client connection in a queue. When a database transaction is
committed, a waiting client connection (if any) will be assigned to
its database connection.
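
The thread doesn't name a specific pooler, but as a rough sketch,
a transaction-mode PgBouncer configuration along these lines gives
exactly that behavior (the host, database name, and pool size are
placeholders, not recommendations from this thread):

    [databases]
    ; forward "mydb" to the real server
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; a server connection is lent to a client only for the duration
    ; of a transaction, then goes back into the pool
    pool_mode = transaction
    ; accept all 300 client connections...
    max_client_conn = 300
    ; ...but keep only a small set of real database connections
    default_pool_size = 20

The 300 clients all stay connected to the pooler; only up to 20
transactions are actually running in the database at any moment,
and the rest wait briefly in the queue described above.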

Every benchmark I've seen shows that this will improve both
throughput and latency over the approach of releasing a "thundering
herd" of requests against the server. Picture a meat counter with
four butchers behind it, and a few spinning devices to slice the meat.
If customers queue up, and the butchers call on people as they are
ready, things go better than if each butcher tries to take on one-
fourth of the customers at a time and constantly switch between one
order and another to try to make incremental progress on all of
them.

> a sys admin raised it from 100 when multiple large projects were
> loaded and the server refused the additional connections.

Whoever is making these decisions needs more training. I suggest
Greg Smith's book:

http://www.postgresql.org/docs/books/

(Full disclosure, I was a technical reviewer of the book and got a
free copy.)

>> you want the controller configured for write-back (with automatic
>> switch to write-through on low or failed battery, if possible).
>
> For performance or safety reasons?

You get better performance with write-back. If you can't rely on
the battery, then write-back is not safe and you need to use write-
through.

> Since the sys admin thinks there's no performance benefit from
> this, I would like to be clear on why we should do this.

If you can get him to change it back and forth for performance
testing, it is easy enough to prove. Write a client application
which inserts one row per database transaction. A nice, simple,
short row -- say, one integer column with no indexes. Have the
external application create the table and do a million inserts.
Try this with both cache settings. It's best not to issue a BEGIN
and COMMIT at all. Don't loop in a function or a DO block, because
that wraps the whole loop in a single implicit transaction.
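
As a rough sketch of such a test client -- not from the original
thread; it assumes Python with psycopg2, and the connection string
is a placeholder -- with autocommit on so that every INSERT is its
own transaction:

    # bench_inserts.py - time one-row-per-transaction inserts (illustrative sketch)
    import time
    import psycopg2

    conn = psycopg2.connect("dbname=test user=postgres")  # placeholder connection string
    conn.autocommit = True   # no BEGIN/COMMIT; each statement commits on its own
    cur = conn.cursor()

    cur.execute("DROP TABLE IF EXISTS cache_test")
    cur.execute("CREATE TABLE cache_test (n integer)")  # one integer column, no indexes

    start = time.time()
    for i in range(1000000):  # a million single-row transactions
        cur.execute("INSERT INTO cache_test VALUES (%s)", (i,))
    elapsed = time.time() - start

    print("1,000,000 commits in %.1f s (%.0f tx/s)" % (elapsed, 1000000 / elapsed))
    cur.close()
    conn.close()

With write-back caching a commit returns as soon as the controller
acknowledges the write; with write-through each commit waits for
the disks, so the difference between the two runs should be hard
to miss.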

> Every now and then the imports behave as if they are suddenly
> taking a deep breath, slowing down. Sometimes, so much we cancel
> the import and restart (the imports pick up where they left off).
>
> What would the bg_writer settings be in this case?

I'm not sure what that is based on the information so far, so it's
unclear whether background writer settings would help; but on the
face of it my bet would be that it's a context switching storm or
swapping, and the connection pool would be the better solution.
Those poor butchers are just overwhelmed....

-Kevin
