Re: Connection pooling - Number of connections

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Brett Wooldridge <brett(dot)wooldridge(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Connection pooling - Number of connections
Date: 2014-03-29 21:20:37
Message-ID: CAMkU=1wdmeQDvVSgzaBYCo-no0Ox7TqizyrOdjg5Co0ZvHRHAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Mar 24, 2014 at 6:27 AM, Brett Wooldridge <
brett(dot)wooldridge(at)gmail(dot)com> wrote:

> Hi, Brett Wooldridge here, one of the principals of HikariCP. I thought
> I'd wade into the conversation pool a little myself if you guys don't mind.
>
> Speaking to David's point...
> >> Reaching the maxPoolSize from the minPoolSize means creating the
> >> connections at the crucial moment where the client application is in
> the
> >> desperate need of completing an important query/transaction which the
> >> primary responsibility since it cannot hold the data collected.
>
> This was one of the reasons I was proposing the fixed pool design. In my
> experience, even in pools that maintain a minimum number of idle
> connections, responding to spike demands is problematic. If you have a
> pool with say 30 max. connections, and a 10 minimum idle connection goal, a
> sudden spike demand for 20 connections means the pool can satisfy 10
> instantly but then is left to [try to] establish 10 connections before the
> application's connectionTimeout (read acquisition timeout from the pool) is
> reached. This in turn generates a spike demand on the database slowing
> down not only the connection establishments themselves but also slowing
> down the completion of transactions that might actually return connections
> to the pool.
>

Do you have publishable test scripts to demonstrate this? It would be nice
for people to be able to try it out for themselves, on their own hardware,
to see what it does. I have seen some database products for which I would
not doubt this effect is real, having seen prodigiously long connection set
up times. But I'm skeptical that that is a meaningful problem for
PostgreSQL, at least not with the md5 authentication method.

> As I think Tom noted is a slidestack I read somewhere, there is a "knee"
> in the performance curve beyond which additional connections cause a drop
> in TPS. While users often think it is a good idea to have maxPoolSize of
> 100, the reality is they can retire/reuse connections faster with a much
> smaller pool. I didn't see a pool of a 2 or 3 dozen connections actually
> impacting performance much when half of them are idle and half are
> executing transactions (ie. the idle ones don't impact the overall
> performance much).
>

I think the knee applies mostly to active connections. I've seen no
indication of completely idle connections causing observable problems in
recent releases, until the number of them gets absurd.

And the location of the knee for the number of active connections is going
to depend greatly on the hardware and the work load.

> Finally, one of my contentions was, either your database server has
> resources or it doesn't. Either it has enough memory and processing power
> for N connections or it doesn't. If the pool is set below, near, or at
> that capacity what is the purpose of releasing connections in that case?
> Yes, it frees up memory, but that memory is not really available for other
> use given that at any instant the maximum capacity of the pool may be
> demanded.
> Instead releasing resources only to try to reallocate them during a demand
> peak seems counter-productive.
>

I pretty much agree with you on that. Most of the arguments I see for
getting rid of idle connections really seem to be arguments for lowering
the maximum number.

That said, I don't like it when people take away my options because they
think I might not be able to set them correctly. Just say the default value
for the min is the same as the max, and most people don't need to change
that. If you don't want to implement a feature, that is one thing, but to
take out a feature that already exists just because you think I'm not quite
clever enough to use it is quite another thing.

Also, often people don't have a realistic work-load generator with which to
test the maximum, or a budget to run such tests. They will have no idea
where to set it. A plausible thing to do in that case is to just set it
rather high and hope the actual usage never get high enough to test whether
the chosen value was correct. That would be more problematic if the pool
size is static.

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Brett Wooldridge 2014-03-30 03:41:30 Re: Connection pooling - Number of connections
Previous Message Dave Johansen 2014-03-27 20:42:39 Partitions and prepared statements?