![]() |
|
|
|||||||
![]() |
Python - Re: database connection pooling from mod_python app |
|
|
Thread Tools | Search this Thread |
|
|
#1 |
|
Posts: n/a
|
Ian Bicking wrote:
> On Wednesday, October 8, 2003, at 11:23 AM, Anthony_Barker wrote: > >> I have been searching around for database connection pooling for a >> mod_python app. > > > My understanding is that you don't really need connection pooling in > mod_python -- you can simply put the connection in a global variable. In fact, connection pooling within an application hosted by a forking Apache server is redundant since only one connection will ever be required at a time. 1 process == 1 request == 1 connection. > Maybe with something like: > > def get_connection(): > global _conn > try: > return _conn > except NameError: > _conn = (make your connection) > return _conn > > > I don't know mod_python enough to know if there's other magic > incantations you need. You don't need "pooling", because each request > is served in its own process -- you simply want to reuse the > connection for future requests. If there's 10 processes, you'll need > 10 separate connections, but those 10 processes will over time serve > many requests so you can use those 10 connections over and over. Ian's system will work just fine (although I think a module-scope variable is nicer than a global) but there are a few things to be aware of as detailed below. First of all I would like to define a couple of terms to try to avoid any confusion: "persistent connection" means a database connection which is opened once and never closed. "connection pool" means a managed collection of (possibly) open connections. A connection pool often has a maximum pool size and may wait for a period of time (i.e. block) for a connection to be closed rather than open a new collection if the pool size is exceeded. Connection objects from a pool are generally wrapped so that close() returns the connection to the pool rather than actually closing it. -- 1. Reused Transaction vs Connection Pool Leaks It is critical that a persistent connection's transaction is *always* completed - either with commit or rollback. If you forget to do this then database operations during the next request that the process handles will use the same transaction ... which could be disasterous. A connection pool will only give out a connection that is "closed" and, typically, rollback() is called by the pool manager as a connection is returned. This avoids the problem of transactions spanning multiple requests. A common problem when using connection pools are connection leaks - a connection is not closed and hence is never returned to the pool - but IMHO this is less serious. I would rather see an application hang waiting for a connection to become available than have corrupt data. The reused transaction and connection leak problems are both programming errors (use a finally block to avoid the problem) but it is easier to spot a connection pool leak - set the maximum pool size to a low number and hammer the application. You'll soon find out if there are any problems opens and closes connection and you can easily find offending code that forgets to call close(). 2. Apache forking As Ian mentions, Apache creates multiple processes to handle requests. A default installation of Apache 1.x on Debian has a MaxClients of 150, i.e. 150 processes. If each one of those processes maintains a persistent connection you can easily hit your database server's connection limit. 150 is a *lot* of connections to keep open anyway but many requests will not even need database access, i.e. images, CSS, JavaScript etc. Perhaps you've tuned your Apache config and set MaxClients to something more approriate, 30 for instance. Even then, it's likely that the number of simultaneous connections that are _required_ is much lower than 30, you might only need 5. 3. Multiple Applications Now let's say your web server hosts 2 applications and both applications use a different database. If persistent connections are used you've just doubled the maximum number of open connections! This scenario applies to an Apache instance with virtual hosting and I think it's the same when mod_python's PythonInterpreter configuration directive is used. -- I'm not saying that persistent connections are inherently bad, just that there a number of issues to be aware of with Apache. If you are using a forking Apache setup and you cannot afford to open new database connections on each request then persistent connections may be your only choice. However, if you need connection pooling to allow your application to scale effectively then you may be better off using an application server (Zope, Webware, Twisted etc) or something like SCGI (http://www.mems-exchange.org/software/scgi/). Another possibility, which I have not yet tried, is SQL Relay (http://sqlrelay.sourceforge.net). Sorry for the length of this post, I hope it helps. Cheers, Matt -- Matt Goodall, Pollenation Internet Ltd w: http://www.pollenation.net e: |
|
![]() |
| Thread Tools | Search this Thread |
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| MCITP: Database Developer and MCITP: Database Administrator | Robert | MCITP | 7 | 12-20-2008 07:59 AM |
| Loses Internet Connection | jmoore00 | General Help Related Topics | 0 | 09-24-2007 04:23 AM |
| How to configure VPN | hi5 | Hardware | 1 | 07-09-2007 11:21 AM |
| New DVD database website | dvdloc8.com | DVD Video | 2 | 10-19-2004 05:59 AM |
| connect a source unit with an S-Video connection to a receiving unit that has an RCA composite video connection ? worth it ? | OCZ Guy | DVD Video | 6 | 08-01-2004 05:44 PM |