Is crawling the stack "bad"? Why?
I've got a case where I would like to know exactly what IP address a client made an RPC request from. This info needs to be known inside the RPC function. I also want to make sure that the IP address obtained is definitely the correct one for the client being served by the immediate function call. That is kind of dumb/obvious to say, but I say it just to highlight that it could be a problem for an RPC server allowing multiple simultaneous connections on multiple threads. I.e., I can't set some simple "current_peer_info" variable when the connection is made and let the RPC function grab that value later, since by the time it does it could easily be wrong.

In order to solve this I toyed with a few schemes, but have (so far) settled on crawling up the stack from within the RPC call to a point where the precise connection info that triggered the RPC call to run could be determined. This makes sure (I think!) that I get the exact connection info in the event of a lot of simultaneous executions on different threads.

It seems hackish, though. I frequently find that I take the long way around to do something only to find out later that there is a nice and tight pythonic way to get it done. This seems like it might be one of those cases, and the back of my mind keeps trying to relegate this into the realm of cheat code that will cause me major pain later. I can't stop thinking of the old days and slapping gotos all over code to fix something "quickly" rather than restructuring properly. Crawling around the stack in non-debugger code always seems nasty to me, but it sure seems to work nicely in this case...

To illustrate this scheme I've got a short program using SimpleXMLRPCServer to do it. The code is below. If you run it you should get an output something like:

RPC call came in on: ('127.0.0.1', 42264)

Does anyone have a better way of doing this? Anyone want to warn me off of crawling the stack to get this type of info?

The docstring for sys._getframe already warns me off by saying "This function should be used for internal and specialized purposes only", but without providing any convincing argument why that is the case. I'd love to hear a reasonable argument... the only thing I can think of is that it starts dipping into lower level language behavior and might cause problems if you aren't careful. Which is almost as vague as "for internal and specialized purposes only".

I'm very curious to hear what you python wizards have to say.

----
import SimpleXMLRPCServer, xmlrpclib, threading, sys

def GetCallerNameAndArgs(StackDepth = 1):
    """This function returns a tuple (a,b) where:
         a = The name of the calling function
         b = A dictionary with the arg values in order
    """
    f = sys._getframe(StackDepth + 1) #+1 to account for this call
    callerName = f.f_code.co_name
    #get the arg count for the frame...
    argCount = f.f_code.co_argcount
    #get a tuple with the local vars in the frame (puts the args first)...
    localVars = f.f_code.co_varnames
    #now get the tuple of just the args...
    argNames = localVars[:argCount]
    #now to make a dictionary of args and values...
    argDict = {}
    for key in argNames:
        argDict[key] = f.f_locals[key]
    return (callerName, argDict)

def GetRpcClientConnectionInfo():
    #Move up the stack to the right point to figure out client info...
    requestHandler = GetCallerNameAndArgs(4)[1]["self"]
    usedSocket = requestHandler.connection
    return str(usedSocket.getpeername())

def StartSession():
    return "RPC call came in on: %s" % GetRpcClientConnectionInfo()

class DaemonicServerLaunchThread(threading.Thread):
    def __init__(self, RpcServer, **kwargs):
        threading.Thread.__init__(self, **kwargs)
        self.setDaemon(1)
        self.server = RpcServer
    def run(self):
        self.server.serve_forever()

rpcServer = SimpleXMLRPCServer.SimpleXMLRPCServer(("", 12390),
                                                  logRequests = False)
rpcServer.register_function(StartSession)
slt = DaemonicServerLaunchThread(rpcServer)
slt.start()

sp = xmlrpclib.ServerProxy("http://localhost:12390")
print sp.StartSession()
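
To make the "current_peer_info" worry concrete, here is a minimal Python 3 sketch. It is not part of the original post: the names serve and recorded and the client labels are made up for illustration. Two threads share one module-level slot, and events force the unlucky interleaving so that the result is repeatable.

```python
import threading

# Shared "current peer" slot: the scheme the post rules out.
current_peer_info = None


def serve(peer, recorded, arrived, proceed):
    """Pretend to handle one connection on its own thread."""
    global current_peer_info
    current_peer_info = peer            # stored when the connection is made
    arrived.set()                       # tell the main thread the value is stored
    proceed.wait()                      # other connections may arrive meanwhile
    recorded[peer] = current_peer_info  # what the RPC function would read later


recorded = {}
a_arrived, a_go = threading.Event(), threading.Event()
b_arrived, b_go = threading.Event(), threading.Event()

a = threading.Thread(target=serve, args=("client-A", recorded, a_arrived, a_go))
b = threading.Thread(target=serve, args=("client-B", recorded, b_arrived, b_go))

a.start()
a_arrived.wait()   # A has stored its peer info and is paused
b.start()
b_arrived.wait()   # B has now overwritten the shared slot
a_go.set()         # let both "RPC functions" read the slot
b_go.set()
a.join()
b.join()

print(recorded["client-A"], recorded["client-B"])
```

With this ordering the handler for client-A reads back client-B's value, which is the failure a single shared variable risks on a threaded server.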
Re: Is crawling the stack "bad"? Why?
Argh... the code wrapped... I thought I made it narrow enough. Here is the same code (sorry), but now actually pasteable.

---
import SimpleXMLRPCServer, xmlrpclib, threading, sys

def GetCallerNameAndArgs(StackDepth = 1):
    """This function returns a tuple (a,b) where:
         a = The name of the calling function
         b = A dictionary with the arg values in order
    """
    f = sys._getframe(StackDepth + 1) #+1 to account for this call
    callerName = f.f_code.co_name
    #get the arg count for the frame...
    argCount = f.f_code.co_argcount
    #get a tuple with the local vars in the frame (args first)...
    localVars = f.f_code.co_varnames
    #now get the tuple of just the args...
    argNames = localVars[:argCount]
    #now to make a dictionary of args and values...
    argDict = {}
    for key in argNames:
        argDict[key] = f.f_locals[key]
    return (callerName, argDict)

def GetRpcClientConnectionInfo():
    #Move up the stack to the location to figure out client info...
    requestHandler = GetCallerNameAndArgs(4)[1]["self"]
    usedSocket = requestHandler.connection
    return str(usedSocket.getpeername())

def StartSession():
    return "RPC call came in on: %s" % GetRpcClientConnectionInfo()

class DaemonicServerLaunchThread(threading.Thread):
    def __init__(self, RpcServer, **kwargs):
        threading.Thread.__init__(self, **kwargs)
        self.setDaemon(1)
        self.server = RpcServer
    def run(self):
        self.server.serve_forever()

rpcServer = SimpleXMLRPCServer.SimpleXMLRPCServer(("", 12390),
                                                  logRequests = False)
rpcServer.register_function(StartSession)
slt = DaemonicServerLaunchThread(rpcServer)
slt.start()

sp = xmlrpclib.ServerProxy("http://localhost:12390")
print sp.StartSession()
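
One practical concern with the hard-coded depth in GetCallerNameAndArgs(4) is that the answer depends on exactly how many frames sit between the RPC function and the request handler. The small Python 3 sketch below is only an illustration: handle, respond and passthrough are hypothetical names, and sys._getframe is a CPython implementation detail. It shows the same lookup returning something different once a wrapper is added to the call chain.

```python
import sys


def caller_name_and_args(depth=1):
    """Return (name, args) for the frame `depth` levels above our caller."""
    frame = sys._getframe(depth + 1)  # +1 to skip this function's own frame
    code = frame.f_code
    arg_names = code.co_varnames[:code.co_argcount]  # positional args come first
    return code.co_name, {name: frame.f_locals[name] for name in arg_names}


def respond():
    # Looks one frame up and assumes that frame is the request handler.
    return caller_name_and_args(1)


def handle(request_id):
    return respond()


def passthrough(func):
    """A do-nothing wrapper, standing in for a decorator or a library change."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


wrapped_respond = passthrough(respond)


def handle_wrapped(request_id):
    return wrapped_respond()


print(handle(7))          # finds handle and its request_id argument
print(handle_wrapped(7))  # finds the wrapper frame instead
```

The second call lands on the wrapper's frame, which has no named positional arguments, so a lookup such as ["self"] on that result would raise KeyError. Any change in the dispatch chain of the server library could shift the frames in the same way.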
Re: Is crawling the stack "bad"? Why?
That is just madness. The incoming IP address is available to the request handler; see the SocketServer docs. Write a request handler that stashes that info somewhere that RPC responders can access in a sane way.
Re: Is crawling the stack "bad"? Why?
> That is just madness.

What specifically makes it madness? Is it because sys._getframe is "for internal and specialized purposes only"? :)

> The incoming ip address is available to the request handler, see the
> SocketServer docs

I know... that is exactly where I get the address, just in a mad way.

> Write a request handler that stashes that info somewhere that rpc
> responders can access it in a sane way.

That is exactly where I started (creating my own request handler, snagging the IP address and stashing it), but I couldn't come up with a stash location that would work for a threaded server. This is the problem I was talking about with the "current_peer_info" scheme. How is the RPC responder function supposed to know which is the right stash, given that when threaded there could be multiple stashes at a time? The IP needs to get down to the exact function execution that is responding to the client... how do I do that?

I had my options as:

1) stash the IP address somewhere where the RPC function could get it
2) pass the IP down the dispatch chain to be sure it gets to the target

I couldn't come up with a way to get 1) to work. Then, trying to accomplish 2), I reluctantly started messing with different schemes involving my own versions of do_POST, _marshaled_dispatch, and _dispatch in order to pass the IP directly down the stack. After some pain at this (those dispatches are weird) I decided it was waaaay too much of a hack. Then I thought "why not go up the stack to fetch it rather than trying to mess with the nice/weird dispatch chain to send it down". I now had a third option...

3) Go up the stack to fetch the exact IP for the thread

After realizing this I had my working stack crawl code only a few minutes later (I had GetCallerNameAndArgs already). Up the stack has a clear path. Down was murky and involved trampling on code I didn't want to override. The result is much cleaner than what I was doing and it worked, albeit with the as yet unfounded "crawling the stack is bad" fear still there.

I should also point out that I'm not tied to SimpleXMLRPCServer; it is just a convenient example. I think any RPC protocol and dispatcher scheme would have the same problem.

I'd be happy to hear about a clean stashing scheme (or any other alternative) that works for a threaded server.

My biggest specific fear at the moment is that sys._getframe will do funky things with multiple threads, but given that my toy example is executing in a server on its own thread and it traces perfectly I'm less worried. Come to think of it, I wonder what happens when you crawl up to and past thread creation? Hmm.
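
For option 2, one route that may be less invasive than rewriting do_POST is sketched below in Python 3. It leans on an assumption worth checking against your own Python version: in the CPython sources I am aware of, the stock do_POST passes getattr(self, '_dispatch', None) to the server's _marshaled_dispatch, so a _dispatch method defined on the request handler runs with that connection's handler in hand. This is an underscore-prefixed hook rather than a documented guarantee, and PeerAwareHandler, rpc_functions and demo are names invented for this sketch.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer, SimpleXMLRPCRequestHandler


def what_is_my_ip(client_ip):
    # The client address arrives as an ordinary argument.
    return "Your IP is: %s" % client_ip


class PeerAwareHandler(SimpleXMLRPCRequestHandler):
    # Hypothetical table of RPC functions that want the caller's IP first.
    rpc_functions = {"WhatIsMyIP": what_is_my_ip}

    def _dispatch(self, method, params):
        # Runs on the handler instance serving this particular connection,
        # so self.client_address belongs to the client being answered.
        try:
            func = self.rpc_functions[method]
        except KeyError:
            raise Exception('method "%s" is not supported' % method)
        return func(self.client_address[0], *params)


def demo():
    """Start a throwaway server on a free local port and make one call."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0),
                                requestHandler=PeerAwareHandler,
                                logRequests=False)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with ServerProxy("http://127.0.0.1:%d" % port) as proxy:
            return proxy.WhatIsMyIP()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The RPC function here takes the IP as a plain parameter, so nothing needs to be stashed or looked up; the cost is that the function table lives on the handler class instead of going through register_function.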
Re: Is crawling the stack "bad"? Why?
Russell Warren <russandheather@gmail.com> writes:
> That is exactly where I started (creating my own request handler,
> snagging the IP address and stashing it), but I couldn't come up with
> a stash location that would work for a threaded server.

How about a dictionary indexed by the thread name?

It's pretty lame, though, that the rpc server module itself doesn't make the request available to the rpc responder. Maybe you should submit a patch.

> My biggest specific fear at the moment is that sys._getframe will do
> funky things with multiple threads,

You should not rely on anything that implementation specific at all. What happens if you want to switch to pypy?
Re: Is crawling the stack "bad"? Why?
Paul Rubin wrote:
> Russell Warren <russandheather@gmail.com> writes:
>> That is exactly where I started (creating my own request handler,
>> snagging the IP address and stashing it), but I couldn't come up with
>> a stash location that would work for a threaded server.
>
> How about a dictionary indexed by the thread name.

The threading.local class seems defined for that purpose, not that I've ever used it ;)

BB
Re: Is crawling the stack "bad"? Why?
> How about a dictionary indexed by the thread name.

Ok... a functional implementation doing precisely that is at the bottom of this (using thread.get_ident), but making it possible to hand around this info cleanly seems a bit convoluted. Have I made it more complicated than I need to? There must be a better way? It sure is a heck of a lot less straightforward than having a reasonably tight CrawlUpStackToGetClientIP function call. But then nothing is more straightforward than a simple goto, either...

So I ask again, what is wrong with crawling the stack?

> What happens if you want to switch to pypy?

If it doesn't work when I decide to switch implementations for some reason, I just fix it when my unit tests tell me it is busted. No? Aren't there also python implementations that don't have threading in them that would fail using thread.get_ident? Seems hard to satisfy all implementations.

> the threading.local class seems defined for that purpose, not that I've ever
> used it ;)

I hadn't heard of that... it seems very useful, but in this case I think it just saves me the trouble of making a stash dictionary... unless successive calls to threading.local return the same instance? I'll have to try that, too.

---
import xmlrpclib, threading, sys, thread
from SimpleXMLRPCServer import SimpleXMLRPCServer, \
                               SimpleXMLRPCRequestHandler

class RpcContainer(object):
    def __init__(self):
        self._Handlers = {} #keys = thread IDs, values=requestHandlers
    def _GetRpcClientIP(self):
        connection = self._Handlers[thread.get_ident()].connection
        ip = connection.getpeername()[0]
        return ip
    def WhatIsMyIP(self):
        return "Your IP is: %s" % self._GetRpcClientIP()

class ThreadCapableRequestHandler(SimpleXMLRPCRequestHandler):
    def do_POST(self, *args, **kwargs):
        #make the handler available to the RPCs, indexed by threadID...
        self.server.RpcContainer._Handlers[thread.get_ident()] = self
        SimpleXMLRPCRequestHandler.do_POST(self, *args, **kwargs)

class MyXMLRPCServer(SimpleXMLRPCServer):
    def __init__(self, RpcContainer, *args, **kwargs):
        self.RpcContainer = RpcContainer
        SimpleXMLRPCServer.__init__(self, *args, **kwargs)

class DaemonicServerLaunchThread(threading.Thread):
    def __init__(self, RpcServer, **kwargs):
        threading.Thread.__init__(self, **kwargs)
        self.setDaemon(1)
        self.server = RpcServer
    def run(self):
        self.server.serve_forever()

container = RpcContainer()
rpcServer = MyXMLRPCServer(
    RpcContainer = container,
    addr = ("", 12390),
    requestHandler = ThreadCapableRequestHandler,
    logRequests = False)
rpcServer.register_function(container.WhatIsMyIP)
slt = DaemonicServerLaunchThread(rpcServer)
slt.start()

sp = xmlrpclib.ServerProxy("http://localhost:12390")
print sp.WhatIsMyIP()
Re: Is crawling the stack "bad"? Why?
On 2008-02-25, Russell Warren <russandheather@gmail.com> wrote:
>
>> the threading.local class seems defined for that purpose, not that I've ever
>> used it ;)
>
> I hadn't heard of that... it seems very useful, but in this case I
> think it just saves me the trouble of making a stash dictionary...
> unless successive calls to threading.local return the same instance?
> I'll have to try that, too.

No, successive calls to threading.local() will return different objects. So, you call it once to get your 'data store' and then use that one object from all your threads. It takes care of making sure each thread gets its own data.

Here is your example, but using threading.local instead of your own version of it. :)

Ian

import xmlrpclib, threading, sys, thread
from SimpleXMLRPCServer import SimpleXMLRPCServer, SimpleXMLRPCRequestHandler

thread_data = threading.local()

class RpcContainer(object):
    def __init__(self):
        self._Handlers = {} #keys = thread IDs, values=requestHandlers
    def _GetRpcClientIP(self):
        #connection = self._Handlers[thread.get_ident()].connection
        connection = thread_data.request.connection
        ip = connection.getpeername()[0]
        return ip
    def WhatIsMyIP(self):
        return "Your IP is: %s" % self._GetRpcClientIP()

class ThreadCapableRequestHandler(SimpleXMLRPCRequestHandler):
    def do_POST(self, *args, **kwargs):
        #make the handler available to the RPCs via the thread-local store...
        thread_data.request = self
        SimpleXMLRPCRequestHandler.do_POST(self, *args, **kwargs)

class MyXMLRPCServer(SimpleXMLRPCServer):
    def __init__(self, RpcContainer, *args, **kwargs):
        self.RpcContainer = RpcContainer
        SimpleXMLRPCServer.__init__(self, *args, **kwargs)

class DaemonicServerLaunchThread(threading.Thread):
    def __init__(self, RpcServer, **kwargs):
        threading.Thread.__init__(self, **kwargs)
        self.setDaemon(1)
        self.server = RpcServer
    def run(self):
        self.server.serve_forever()

container = RpcContainer()
rpcServer = MyXMLRPCServer(
    RpcContainer = container,
    addr = ("", 12390),
    requestHandler = ThreadCapableRequestHandler,
    logRequests = False)
rpcServer.register_function(container.WhatIsMyIP)
slt = DaemonicServerLaunchThread(rpcServer)
slt.start()

sp = xmlrpclib.ServerProxy("http://localhost:12390")
print sp.WhatIsMyIP()
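
For anyone reading this with Python 3, roughly the same idea might look like the sketch below. The assumptions are mine, not the thread's: the module names are the Python 3 ones (xmlrpc.server, xmlrpc.client, socketserver), ThreadingMixIn is added because SimpleXMLRPCServer on its own handles one request at a time, the container class is dropped for brevity, and ThreadedXMLRPCServer and demo are names invented here.

```python
import threading
from socketserver import ThreadingMixIn
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer, SimpleXMLRPCRequestHandler

thread_data = threading.local()  # created once, shared by all threads


def what_is_my_ip():
    # Reads the handler stored by the thread that is serving this request.
    return "Your IP is: %s" % thread_data.request.client_address[0]


class ThreadCapableRequestHandler(SimpleXMLRPCRequestHandler):
    def do_POST(self):
        thread_data.request = self       # visible only to this thread
        try:
            super().do_POST()
        finally:
            del thread_data.request      # do not leave a stale handler behind


class ThreadedXMLRPCServer(ThreadingMixIn, SimpleXMLRPCServer):
    daemon_threads = True                # one worker thread per connection


def demo():
    """Start a throwaway server on a free local port and make one call."""
    server = ThreadedXMLRPCServer(("127.0.0.1", 0),
                                  requestHandler=ThreadCapableRequestHandler,
                                  logRequests=False)
    server.register_function(what_is_my_ip, "WhatIsMyIP")
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with ServerProxy("http://127.0.0.1:%d" % port) as proxy:
            return proxy.WhatIsMyIP()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Because each request is handled on its own thread, the handler stored in do_POST and the handler read in what_is_my_ip are always the same object for a given call.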
Re: Is crawling the stack "bad"? Why?
Thanks Ian... I didn't know about threading.local before but have been experimenting and it will likely come in quite handy in the future. For this particular case it does basically seem like a replacement for the threadID indexed dictionary, though. I.e., I'll still need to set up the RpcContainer, custom request handler, and custom server in order to get the info handed around properly. I will likely go with this approach since it lets me customize other aspects at the same time, but for client IP determination alone I still half think that the stack crawler is cleaner.

No convincing argument yet on why crawling the stack is considered bad? I kind of hoped to come out of this with a convincing argument that would stick with me...
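
The "replacement for the threadID indexed dictionary" reading can be checked with a small Python 3 sketch. The names stash, worker and seen are illustrative only. One threading.local object is created once and shared, yet each thread only sees the attribute it set itself.

```python
import threading

stash = threading.local()       # one object, shared by every thread
barrier = threading.Barrier(2)  # lets both workers write before either reads
seen = {}


def worker(name):
    stash.peer = name           # stored per thread, like a dict keyed by thread id
    barrier.wait()              # by now the other thread has written too
    seen[name] = stash.peer     # each thread still reads back its own value


threads = [threading.Thread(target=worker, args=(n,))
           for n in ("client-A", "client-B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen["client-A"], seen["client-B"])
print(hasattr(stash, "peer"))   # the main thread never set the attribute
```

The main thread sees no peer attribute at all, which is the per-thread isolation that the hand-rolled dictionary provided by keying on thread.get_ident().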
Re: Is crawling the stack "bad"? Why?
Russell Warren wrote:
> Thanks Ian... I didn't know about threading.local before but have been
> experimenting and it will likely come in quite handy in the future.
> For this particular case it does basically seem like a replacement for
> the threadID indexed dictionary, though. ie: I'll still need to set
> up the RpcContainer, custom request handler, and custom server in
> order to get the info handed around properly. I will likely go with
> this approach since it lets me customize other aspects at the same
> time, but for client IP determination alone I still half think that
> the stack crawler is cleaner.
>
> No convincing argument yet on why crawling the stack is considered
> bad? I kind of hoped to come out of this with a convincing argument
> that would stick with me...
>
OK, if you crawl the stack I will seek you out and hit you with a big stick. Does that affect your decision-making?

Seriously, crawling the stack introduces the potential for disaster in your program, since there is no guarantee that the calling code will provide the same environment in future releases. So at best you tie your solution to a particular version of a particular implementation of Python.

You might as well not bother passing arguments to functions at all, since the functions could always crawl the stack for the arguments they need. A major problem with this is that it constrains the caller to use particular names for particular function arguments. What happens if two different functions need arguments of the same name?

Seriously, forget this craziness.

regards
 Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/