Hierarchical Profile Module
From PyWiki
While nearing the final stages of my Python-Ogre based game (3d snake game on spheres), i decided it was time to do some profiling of my code. Not only because the game just felt too slow with regards to my system, but also to get some more knowledge about what i had written down over the past two years.
After trying the built-in Python modules 'profile' and 'cprofile', i decided they wouldn't help me much. IMO they show lots of information about internal Python stuff, but thereby clutter the view on the things i wanted too see, which is my own game flow.
Then i used this profiler for a few days: http://code.activestate.com/recipes/496938/
However because i wanted to be able to run profiles nested, and to be able to measure average times, i decided to brew my own... Of course, this method is intrusive; you have to modify your code, which is a bit of a downside.
Contents |
Assumption
For my own game i use the sample framework, thus eventually "..renderer\OGRE\sf_OIS.py". I have put my profile module at the same level to be able to see how my game code ('mainloop') compares to the full Ogre cycles. But more on that later.
Before you start
Store the Profile Module
I'd recommend to save the module's sources (see bottom of page) into "..Lib\site-packages\ogre\renderer\OGRE\Profiler.py", as stated earlier. However you are free to place it anywhere else, for example if you don't use the sample framework. The module declares an instance called "GProfiler". That's the one we need.
Use Psyco
Profiling a Python application without Psyco installed, is like profiling a C++ debug build; it's not so relevant. Averagely, the percentages in the results will probably match, but there's always the risk of spending time on the wrong (read: not really slow) pieces of code.
Usage
Assuming you have stored the file in the aforementioned location, use this line to access the profiler:
from ogre.renderer.OGRE.Profiler import GProfiler
Now, there's only three methods which you need: 'start(name)', 'stop(name)' and 'dump'. Start and stop should match, but you may nest your calls. It doesn't matter where you put your calls, here's some quick samples:
# Option 1: surround a function call GProfiler.start("Somefunction") SomeFunction() GProfiler.stop("Somefunction") # Option 2: time inside a function def SomeFunction(self): GProfiler.start("InsideSomeFunction") # Some logic here... GProfiler.stop("InsideSomeFunction") # Option 3: time some statement GProfiler.start("Pow") result = math.pow(2, 64) GProfiler.stop("Pow") # Print results GProfiler.dump()
Disable
To quickly disable the profiler, call GProfiler.setEnabled(False). However since this still invokes all calls to 'start' and 'stop', you'd best comment or remove all these calls before shipping your game.
Samples
Now show some samples with results printed. Let's start simple:
GProfiler.start("UpdateHero") # Game logic here... GProfiler.stop("UpdateHero") GProfiler.start("UpdateGUI") # Game logic here... GProfiler.stop("UpdateGUI") GProfiler.dump()
This will output the following:
Results: -UpdateHero (calls: 1, tm: 0.000000s, avg: 0.000000s, 50.00%) -UpdateGUI (calls: 1, tm: 0.000000s, avg: 0.000000s, 50.00%)
It shows that each 'slot' has 1 call, and zero time. When no time could be measured (code too fast) then the percentage will be calculated as 100% / slots. Also, both slots are at the same level. (Same indentation.)
Now a more real sample, for which i modified the Demo_Grass.py sample application. In order to get percentages relative to the entire Ogre cycle, i have wrapped the "startRendering" call in sf_OIS.py as follows:
# ... GProfiler.start("startRendering") self.root.startRendering() GProfiler.stop("startRendering") # ...
Followed by some start/stop calls inside the demo itself: LightWibbler::setValue, GrassListener::waveGrass and GrassListener::frameStarted. Running the sample for a while results in the following output:
Results: -startRendering (calls: 1, tm: 25.477000s, avg: 25.477000s, 100.00%) --frameStarted (calls: 580, tm: 2.674000s, avg: 0.004610s, 10.50%) ---waveGrass (calls: 580, tm: 2.583000s, avg: 0.004453s, 96.60%) --UpdateLightWibbler (calls: 580, tm: 0.047000s, avg: 0.000081s, 0.18%)
Note: the 'dump' call was placed at the end of the application:
application = Grass_Application() application.go() GProfiler.dump() # <- output
It shows that 10.5% of the time is spent in our framelistener (frameStarted), and 0.18% on our light. Thus Ogre had almost 90% to spend on rendering frames. This remainder value should be kept to a maximum in order to get highest frame rates. Of this 10.5%, 96.6% is spent on the waveGrass method. And so on...
More on results
Of course, since the profiler methods need to be called, and some administration needs to be done, all the time results include some overhead. However, since this overhead is always the same (at least for everything which you profile) the percentages will remain valid. Also, once you have found and fixed your bottle necks, you can remove the profiler lines again.
As mentioned below in TODO, the results are not sorted yet. However the most important ones are often those with highest percentages or high average call time. But this can heavily depend on the type of application which you are writing.
TODO
- The results are just printed as they got collected. There is no sorting (yet). Maybe i will add certain options later.
- Results are now printed to stdout. Add some nice file-writing later.
Full sources
Now, what you've all been waiting for :)
""" Provides hierarchical profiling of you Python(-Ogre) code. Free to use. Date: 18-08-2008 Author: scriptkid Contact: pkooman at gmail.com """ import time TRACING = False class TimedSlot: """ A possibly recursive structure to summarize ellapsed times between start/stop cycles for a names slot.""" def __init__(self, aName, aParent): self._name = aName # When we are stopped, measure at the parent again self._parent = aParent self._calls = 0 self._totalTime = 0 self._lastStartTime = 0 self._started = False self._children = [] # Possibly recurse def start(self): """ Marks the beginning of (more) timing on this slot.""" if self._started: print "'%s' already started; denied." %self._name else: #print "Started '%s'" %self._name self._started = True self._calls += 1 # Start now self._lastStartTime = time.time() def stop(self): """ Marks the end of (more) timing on this slot.""" if not self._started: print "'%s' was not started; denied." %self._name else: self._started = False # Increase the total time of this slot. This includes the ellapsed time of children. delta = time.time() - self._lastStartTime if TRACING: print "Add %f to '%s'" %(delta, self.getNamePath()) self._totalTime += delta #print "Stopped at: %f" %self._totalTime return self._parent def startChild(self, aName): """ Marks the beginning of additional timing of a (new) child slot. This happens when you call start within an earlier start. """ child = self._findChildByName(aName) if child: #print "restart child '%s'" %child.getName() child.start() else: child = TimedSlot(aName, self) # We are its parent if TRACING: print "Add profile slot for '%s'" %child.getName() child.start() self._children.append(child) return child def _findChildByName(self, aName): """ Looks up the given child by name.""" for child in self._children: if child.getName() == aName: return child return None # Member access def getName(self): """ What it sais """ return self._name def getNamePath(self): """ For tracing purposes.""" name = self._name parent = self._parent while parent: name = parent.getName() + "->" + name parent = parent.getParent() return name def isStarted(self): """ Returns whether this slot is currently being monitored.""" return self._started def getCalls(self): """ Returns the number of 'starts' on this slot.""" return self._calls; def getTotalTime(self): """ Returns the total time of all start/stop cycles on this slot.""" return self._totalTime def getAverageTime(self): """ What it sais """ return self._totalTime / max(self._calls, 1) def getPercentage(self): """ Our time / parent time, which is considered 100%""" if not self._parent: # No parent return 100 # Root elif self._parent.getTotalTime() == 0: return 100 / len(self._parent.getChildren()) else: return (self._totalTime / self._parent.getTotalTime()) * 100 def getParent(self): """ What it sais """ return self._parent def getChildren(self): """ What it sais """ return self._children class Profiler: """ A utility class for profiling execution time for code. Multile start/stops can be used to calculate an average and a total.""" def __init__(self): # Create a root, easier for the administration self._root = None self._currentSlot = None self._enabled = True def setEnabled(self, aValue): """ This is a quick way to disable the profiling. When your game is done, for max. performance you'd better comment or remove all calls to start/stop.""" self._enabled = aValue def start(self, aName): if not self._enabled: return if not self._currentSlot: # Very first call, create helper root item and start it self._root = TimedSlot("Root", None) # No parent # The first start/stop will count as 100% of the total time. self._root.start() self._currentSlot = self._root.startChild(aName) elif self._currentSlot.isStarted(): if self._currentSlot.getName() == aName: print "Slot '%s' already started; denied." %(aName) else: # Start nested child self._currentSlot = self._currentSlot.startChild(aName) else: # Restart a stopped slot self._currentSlot.start() def stop(self, aName): """ Stops the current time slot, and adds the difference with the most recent 'start' for this slot.""" if not self._enabled: return if self._currentSlot.getName() <> aName: print "Start/stop don't match (got '%s', expected '%s')" %(aName, self._currentSlot.getName()) else: # Stop and retrieve parent self._currentSlot = self._currentSlot.stop() def dump(self): if not self._enabled: return # Stop all while self._currentSlot: self._currentSlot = self._currentSlot.stop() print "Results:" self._printOutput(self._root, 0) def _printOutput(self, aSlot, anIndent): """ Recursively print some stats.""" if aSlot: if aSlot.getName() <> "Root": line = "%s%s (calls: %d, tm: %fs, avg: %fs, %3.2f%%)" %("-" * anIndent, aSlot.getName(), aSlot.getCalls(), aSlot.getTotalTime(), aSlot.getAverageTime(), aSlot.getPercentage()) print line # Recurse for child in aSlot.getChildren(): self._printOutput(child, anIndent+1) def reset(self): """ Prepare for a next run """ if not self._enabled: return self._root = None self._currentSlot = None # Global profiler GProfiler = Profiler()
