Review Board 2.0.15


Regressions: Start of new regression system

Review Request #834 - Created Aug. 30, 2011 and updated

Information
Ali Saidi
gem5
Reviewers
Default
ali, gblack, nate, stever
Regressions: Start of new regression system

This code is to start a discussion around how to implement the next
regression system for gem5. Attached is a simple example that doesn't
run gem5 yet, but is getting closer. The big question here is how to 
build up configuration files to hopefully minimize duplication of code.

   
Posted (Aug. 31, 2011, 8:10 p.m.)
I probably got more into the details than you're looking for at this point, but I don't have much to say about the larger aspects. I'm not completely sure where this is going as far as the big picture, but I don't see anything to be too worried about yet.
new_tests/conftest.py (Diff revision 1)
 
 
You forgot power.
new_tests/conftest.py (Diff revision 1)
 
 
It's currently quick and long which don't quite go together. quick, medium, and fast almost go together, but quick and fast are the same thing. Also, we're talking about running time primarily and not speed. I think what makes the most sense is short, medium, and long.

Also, you might want to add something for build type (debug, opt, fast).
new_tests/conftest.py (Diff revision 1)
 
 
I'm sure there's a reason, but why can't we have an __init__?
new_tests/conftest.py (Diff revision 1)
 
 
We want to be able to have tests in other directories too, right? Maybe not in v1?
new_tests/conftest.py (Diff revision 1)
 
 
I would really like to see us move away from this hardcoded path since it makes our simulation scripts brittle. This is inherited, though, and somewhat orthogonal.
new_tests/conftest.py (Diff revision 1)
 
 
Yes.
new_tests/conftest.py (Diff revision 1)
 
 
What's the difference between <test> and <canonical_name>?
new_tests/conftest.py (Diff revision 1)
 
 
This feels like a hack. At the very least this name shouldn't be hardcoded, and even better this case should be handled by a wrapper function that prestrips the file contents.
new_tests/conftest.py (Diff revision 1)
 
 
It would be better to have an explicit list than to just grab everything and filter things out.
new_tests/conftest.py (Diff revision 1)
 
 
Hardcoding stats.txt isn't great.
new_tests/conftest.py (Diff revision 1)
 
 
It sounds like you realize this is a limitation, but we'll want to only generate test objects for tests that will work, have reference outputs, run on arch == whatever, etc.
new_tests/conftest.py (Diff revision 1)
 
 
This help text is inconsistent since it's worded as a question and the others are statements. Also, two of them say they can be specified multiple times and two don't. I'm suspecting there isn't actually any difference. This particular option would theoretically accept multiple selections, right?
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
It's a silly nit, but could you space this out? Otherwise the comment looks like it goes with the imports. It obviously doesn't if you look at it for a few seconds, but you have to look at it for a few seconds.
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
Same comment as above.
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
I see what you're doing here and your names make sense when you see what's going on, but the similarity of test_root and tests_root is a little confusing at first. I think there's an os or sys function to split a path into its components. Maybe you could do that and get a slice of it like this [:-1]? I think that would do it. Not a big deal.
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
You dropped an l in call.

Also, won't this not work if it's not at the end? regression_test hasn't been defined yet, and I believe the if will happen right away, not once you get to the end of the file.
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
Why not use curly brace syntax?
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
space in xrange
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
Space after the comma? I'm not sure about that one. Without parenthesis, spacing it out makes it less obvious that you're assigning to all that at once, although it's still easy to tell. Maybe you just need parenthesis.
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
This looks like a job for opt_parse, or whatever it's called.
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
This may not be a good scheme because it forces you to have a particular set of categories that mean particular things. What if I don't have an arch, or cpu, or use gem5 memory, or...

I'm not sure exactly what you're going for here so I don't have any suggestions.
Posted (Sept. 21, 2011, 3:19 p.m.)
Overall this looks like a really good start... given that I don't know pytest at all, it's hard to be confident, but once you get to something that works and we can play around with it we should be able to provide more concrete feedback.  Looks like you're headed in the right direction to me though.
new_tests/conftest.py (Diff revision 1)
 
 
Should canonical_name include sname?  Looks like below wherever you use canonical_name you're explicitly including sname too.
new_tests/conftest.py (Diff revision 1)
 
 
as I mention below, I'd prefer to see us use the existing configs/example scripts by default, so it might be nice to build a command line here that doesn't directly involve the test infrastructure (other than locating inputs and outputs)
  1. So I actually played with this some more recently and got hello world working running from the command line. However, the process convinced me that we don't actually want to run gem5 from the command line via the existing scripts. Perhaps there should be regression tests that do, but they should exist to test the config scripts and not be a requirement for all tests.
    
    The primary reason I came to this conclusion is that we throw a lot of information away when placing things on the command line and we don't get to have error handling in python scripts communicate information back to the simulator in a nice way. For example, we've talked a lot about if a input file doesn't exist we should skip the test (as opposed to reporting pass or fail). If we look at how cpu2000.py handles files not found it raises an exception. This flows back to se.py where we panic on the exception. I think it would be much nicer if the test infrastructure caught the exception when the LiveProcess was being created as opposed to trying to do some string matching on stderr or perhaps defining certain return codes that means "input file not found." 
    
    Thoughts?
  2. My main thought is that I'm glad to see you're back to looking at this again... thank you thank you!
    
    I see your point and it's a good one.  My main thing is not just being able to test the config scripts but perhaps even more avoiding the redundancy of having a bunch of similar-but-different config scripts used only for tests like we do know.  So it's less about invoking se.py or fs.py than making sure that we use the configs/common scripts rather than duplicating their functionality elsewhere.  Ideally we might end up refactoring code between configs/example and configs/common to maximize reuse.
    
    Does that seem like it would work?
new_tests/conftest.py (Diff revision 1)
 
 
this func name is a little misleading... sounds like it might actually run the test here
new_tests/conftest.py (Diff revision 1)
 
 
I don't have a problem with hardwiring standard output filenames like stats.txt or simout... these are going to be run under a controlled environment, so they won't change from test to test, and making them variable just obscures what's going on IMO.
new_tests/conftest.py (Diff revision 1)
 
 
Yea, we can either fork the existing perl script or port it to python... the perl code works, fwiw.
new_tests/hello_world/test_hello.py (Diff revision 1)
 
 
I'd really like to see the regressions use the config scripts in configs/example, if possible; that way those scripts get included in the testing, and we don't have redundant config code just for tests.