stats: Optimise the text-based stats output to improve speed
Review Request #1643 - Created Jan. 15, 2013 and updated
| Information | |
|---|---|
| Andreas Hansson | |
| gem5 | |
| default | |
| Reviewers | |
| Default | |
Changeset 9714:68ea5c8f8d6c
---------------------------
stats: Optimise the text-based stats output to improve speed
The text-based stats output is slow than the original C++ output. This patch
addresses some of those issues by removing:
Asserts that are not required
stats.ValueProxy, as it is no longer required
Unnecessary calls to __getattr__ and __hasattr__
It also changes the formula calculation to only calculate the values and totals
of formulas if they have not been passed in from the C++. This allows other
formulas to be added, whilst ensuring that the stats output is sufficiently
fast.
Testing has shown a 6x - 8x increase in stats output speed.
Ensure that the output files are identical
Issue Summary
| Description | From | Last Updated | Status |
|---|---|---|---|
| If the asserts really aren't needed, it's fine to remove them, but an alternative to removing asserts, is to bytecode ... | Nathan Binkert | Jan. 25, 2013, 1:48 a.m. | Open |
| I'm not objecting to their removal, since in this case, it is probably overkill. I'm just suggesting a general speedup ... | Nathan Binkert | Jan. 29, 2013, 2:20 a.m. | Open |
-
src/python/m5/stats/display.py (Diff revision 1) -
Why do we need to re-create noNan?
-
src/python/m5/stats/__init__.py (Diff revision 1) -
Since you are changing this function call, should not the function definition also change?
I could be wrong about the getattr thing, but this patch seems like it needs work.
-
src/python/m5/stats/__init__.py (Diff revision 1) -
seems like an unrelated fix.
-
src/python/m5/stats/display.py (Diff revision 1) -
I'm probably missing context, but this seems superfluous. You do the same thing below.
-
src/python/m5/stats/display.py (Diff revision 1) -
fyi sum(x for x in self.value if not isnan(x)) is faster than what you have because it does not create a temporary list. you could also just replace the square brackets with parens to turn the list into a generator expression.
-
src/python/m5/stats/display.py (Diff revision 1) -
Imho, it is very bad form to catch all exceptions here. You will for example, trap a KeyboardInterrupt exception which seems like a bad idea. Catch the actual exception(s) that you're worried about. To catch more is a bug.
-
src/python/m5/stats/display.py (Diff revision 1) -
I don't think that you're improving anything here at all. Are you sure you understand __getattr__?
-
src/python/m5/stats/info.py (Diff revision 1) -
If the asserts really aren't needed, it's fine to remove them, but an alternative to removing asserts, is to bytecode compile python optimized in m5.fast?
-
src/python/m5/stats/info.py (Diff revision 1) -
Poor use of __getattr__
-
src/python/m5/stats/info.py (Diff revision 1) -
I think that you're using __getattr__ wrong. __getattr__ is only called if the attribute is missing, so if you were to set _the_total instead of calculatedTotal, you wouldn't even need the initial check that you do here.
Description: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 2 (+49 -40) |
-
src/python/m5/stats/info.py (Diff revision 1) -
I'm not objecting to their removal, since in this case, it is probably overkill. I'm just suggesting a general speedup plan for you guys. When you bytecode compile python as optimized code, all it does is remove the assert statements. So, in the build process where we compile all of the python objects (src/Sconscript, embedPyFile), we could compile optimized for m5.fast and that would remove all assertions. (All that said, it's non trivial since I think you need to run python with -O to be able to compile optimized modules.)
-
src/python/m5/stats/info.py (Diff revision 1) -
I never said I was perfect! Unfortunately, there is tons of my crappy code in here. Anyway, I wasn't talking about this specific line, but the new lines below with the hasattr. __getattr__ does more or less what you're doing there. Get rid of the hasattr stuff, and further below, where you set "self.calculatedValue", change it to "self._the_value" This will allow you to avoid unnecessary calls to __getattr__ (the way you've written it, you'll call it every time you try to get _the_value).
-
src/python/m5/stats/info.py (Diff revision 1) -
Same as above
Description: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 3 (+46 -38) |
-
src/python/m5/stats/display.py (Diff revision 3) -
try/except may be faster here. I haven't benchmarked that though.
-
src/python/m5/stats/display.py (Diff revision 3) -
Don't catch all exceptions. This can swallow things like KeyboardInterrupt. Be explicit about catching exceptions. This seems like it should catch AttributeError and or IndexError only.
-
src/python/m5/stats/display.py (Diff revision 3) -
Similarly, this should only catch AttributeError.
Description: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 4 (+46 -38) |
