Re: Caching connection pool?

From: Israel Brewster <israel(at)ravnalaska(dot)net>
To: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
Cc: "psycopg(at)postgresql(dot)org" <psycopg(at)postgresql(dot)org>
Subject: Re: Caching connection pool?
Date: 2017-06-08 19:28:26
Message-ID: F5CA497E-04AC-4286-8F94-A9E72ACBAD95@ravnalaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

On Jun 8, 2017, at 11:02 AM, Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com> wrote:
>
> On Thu, Jun 8, 2017 at 7:28 PM, Israel Brewster <israel(at)ravnalaska(dot)net> wrote:
>
>> To resolve this, I went ahead and wrote up a AbstractConnectionPool subclass
>> that implements the caching logic I laid out above
>> (https://github.com/psycopg/psycopg2/pull/565)
>
> Thank you for your work, Israel
>
> For the list: I've not been a heavy pool user for a long time, and the
> previous logic always felt a bit weak (destroy immediately a
> connection if > minconn?) So I welcome improvement on this object. I'd
> be happy to know from other people what they expect as a reasonable
> default behaviour for the connection pool.
>
>> At the moment I would
>> consider it to be a fairly basic implementation - pruning of expired
>> connections occurs only on a call to putconn (as opposed to running
>> periodically), and connections time out at a (user-specified) period of time
>> from when they were created, rather than last used, for example.
>
> For me it is desirable that the pool is "passive": a periodic run
> would require a separate thread, which then someone would like to be
> an eventlet green thread and some other a gevent green thread... In
> any case it wouldn't behave in a deterministic way.

Good point. I was thinking of the situation where you might have a "rush" of connections needed, and then nothing, whereupon all the connections opened in the rush would stay open indefinitely, but that's probably a corner case that's not worth worrying about - if someone is using a pool, chances are connections are being returned to it regularly, and so old connections would get cleaned up regularly. If not, oh well.

>
> Probably using the date of last usage would be better than the date of
> creation, otherwise you would just delete all the connections every 10
> minutes, or something like that.

Right. Incidentally, I have seen some things that do pretty much exactly that: I've seen some web servers, for example, that have a setting where a given process will only server up so many requests before it is killed and re-spawned. So similar thing, only time based. Agreed that date of last usage would probably be better here. Of course, that leads to the question of if there should be some logic, such as "least recently used", applied when getting a connection, but it probably doesn't really matter.

>
>> It's
>> arguable that both those elements (and perhaps others) could be improved on,
>> but in my initial testing it appears to work. What do you guys think? Is
>> this worth my time, or am I re-creating the wheel or engaging in premature
>> optimization or something? Is the logic as I laid it out above good, or
>> would there be some improvements that could be suggested for it? One thing I
>> was thinking that could be helpful would be that if it saw the pool of
>> available connections getting low, it could open one or more additional
>> connections "in the background" so they are ready when needed. But maybe
>> that's going a bit overboard?
>
> I think the pool provided by psycopg should still be a "mechanism"
> rather than a "policy". It would be great to improve on what we
> currently have, but not the pool too "intelligent": a super-smart pool
> would be probably tied and tweaked to the application using it, which
> may decide to use e.g. background threads to maintain the pool at a
> certain level, may periodically check the state of the connections,
> etc. But the pool is a weird object, it's a bit on the fence between
> being mechanism and being policy, so I welcome a discussion about what
> should be reasonable to expect from this object out of the psycopg
> box, what may instead belong to an extension module from PyPI, what is
> too closely tied to an appication/environment to be generically
> useful.

Exactly - I could probably come up with a bunch of "cool" stuff for the pool to do, like connecting to multiple servers or trying to dynamically predict the number of needed connections or something, but most of it probably wouldn't be appropriate for a "base" pool class. Discussion is good :)

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------
>
> Cheers,
>
> -- Daniele

In response to

Browse psycopg by date

  From Date Subject
Next Message Rory Campbell-Lange 2017-06-11 10:43:52 Call plpgsql function with an array of custom type
Previous Message Daniele Varrazzo 2017-06-08 19:02:57 Re: Caching connection pool?