ruby: Fixes clock domains in configuration files.
Review Request #2218 - Created March 31, 2014 and submitted
| Information | |
|---|---|
| Emilio Castillo | |
| gem5 | |
| Reviewers | |
| Default | |
This patch fixes the se.py script by adding the ruby clock domain. On its own, this would leave the ruby clock domain set at 1GHz by default, so simulations with the cpu clock set at 2GHz or 1GHz would report the same execution time, which could distort power measurements. The patch therefore also sets the clock domain for each coherence protocol. The L1 controllers and the Sequencer now share the cpu clock domain, while the rest of the components use the ruby clock domain. Thanks to Mr. Nilay Vaish for his help in figuring out what was happening.
This was tested using a timing cpu with a program that is neither memory-bound nor I/O-bound, such as:
#include <math.h>
#include <stdio.h>
int main(void)
{
    float var = 0.0f;
    int i;
    /* Compute-bound loop: no significant memory or I/O traffic. */
    for (i = 0; i < 800000; i++)
        var = exp(var);
    printf("var %f\n", var);
    return 0;
}
If the CPU frequency is halved from 2GHz to 1GHz, execution time is expected to double.
(This has been verified with the classic memory model).
build/X86_MESI_Two_Level/gem5.fast configs/example/se.py -n 2 -c ./a.out --cpu-type=timing --caches --cpu-clock=2GHz
sim_seconds 0.076027 # Number of seconds simulated
system.cpu0.numCycles 152035861
build/X86_MESI_Two_Level/gem5.fast configs/example/se.py -n 2 -c ./a.out --cpu-type=timing --caches --cpu-clock=1GHz
sim_seconds 0.152036
system.cpu0.numCycles 152054288
However, if ruby is used (with se.py fixed by adding the ruby clock), execution time is the same at 2GHz and 1GHz for the cpu.
build/X86_MESI_Two_Level/gem5.fast configs/example/se.py -n 2 --ruby --num-l2cache=2 --num-dirs=2 -c ./a.out --cpu-clock=2GHz
sim_seconds 0.304070
system.cpu0.numCycles 608140702
build/X86_MESI_Two_Level/gem5.fast configs/example/se.py -n 2 --ruby --num-l2cache=2 --num-dirs=2 -c ./a.out --cpu-clock=1GHz
sim_seconds 0.304070 # Number of seconds simulated
system.cpu0.numCycles 304070351
Suppose the cache access latency is set to 2 cycles at 2GHz. If the CPU frequency is also set to 2GHz, a memory access takes 2 CPU cycles, even for instruction fetch in the simple cpus.
If the CPU frequency is now lowered to 1GHz, each memory access to the L1s takes 1 cycle as seen from the cpu side.
Instruction fetch now takes 1 cycle, so the 2GHz cpu executes exactly twice as many cycles as the 1GHz cpu:
608140702/304070351=2.0
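The comparison can be checked directly from the figures quoted in the four runs above. The following is only a small sketch over those reported numbers; the dictionary layout and function names are made up for illustration and are not part of gem5.

```python
# Statistics copied from the four runs above,
# keyed by (memory model, CPU clock in GHz).
sim_seconds = {
    ("classic", 2): 0.076027,
    ("classic", 1): 0.152036,
    ("ruby", 2): 0.304070,
    ("ruby", 1): 0.304070,
}
num_cycles = {
    ("classic", 2): 152035861,
    ("classic", 1): 152054288,
    ("ruby", 2): 608140702,
    ("ruby", 1): 304070351,
}


def time_ratio(model):
    """Simulated time at 1GHz divided by simulated time at 2GHz."""
    return sim_seconds[(model, 1)] / sim_seconds[(model, 2)]


def cycle_ratio(model):
    """cpu0 cycle count at 2GHz divided by cycle count at 1GHz."""
    return num_cycles[(model, 2)] / num_cycles[(model, 1)]


for model in ("classic", "ruby"):
    print(f"{model}: time ratio {time_ratio(model):.3f}, "
          f"cycle ratio {cycle_ratio(model):.3f}")
```

With the classic memory model the time ratio is close to 2 while the cycle count barely changes; with ruby the time ratio is 1 and the cycle count doubles instead.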
This patch fixes it with an approach similar to the one taken in the classic memory model, where the L1 controllers are set to the cpu clock domain.
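To make the effect of the clock domain assignment concrete, here is a toy model of the 2-cycle example. It is a sketch under stated assumptions, not gem5 code: `ClockDomain` and `l1_access_in_cpu_cycles` are hypothetical stand-ins, and the ruby clock is assumed to stay at 2GHz as in the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockDomain:
    """Toy stand-in for a clock domain (not the gem5 class)."""
    freq_hz: float

    @property
    def period_s(self) -> float:
        return 1.0 / self.freq_hz


def l1_access_in_cpu_cycles(latency_cycles, l1_domain, cpu_domain):
    """Convert an L1 latency, counted in the L1's own clock, into CPU cycles."""
    access_time_s = latency_cycles * l1_domain.period_s
    return access_time_s / cpu_domain.period_s


L1_LATENCY_CYCLES = 2            # latency used in the example
ruby_domain = ClockDomain(2e9)   # assumption: ruby clock fixed at 2GHz

for cpu_freq_hz in (2e9, 1e9):
    cpu_domain = ClockDomain(cpu_freq_hz)
    # L1 clocked by the fixed ruby domain: absolute latency never changes.
    before = l1_access_in_cpu_cycles(L1_LATENCY_CYCLES, ruby_domain, cpu_domain)
    # L1 clocked by the cpu domain: latency scales with the cpu clock.
    after = l1_access_in_cpu_cycles(L1_LATENCY_CYCLES, cpu_domain, cpu_domain)
    print(f"cpu at {cpu_freq_hz / 1e9:.0f}GHz: "
          f"L1 in ruby domain = {before:.1f} cpu cycles, "
          f"L1 in cpu domain = {after:.1f} cpu cycles")
```

When the L1 follows the cpu clock domain, an access costs the same number of cpu cycles at either frequency, so halving the cpu clock doubles the access time, which is the behaviour seen with the classic memory model.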
Issue Summary
| Description | From | Last Updated | Status |
|---|---|---|---|
| I realise this is orthogonal to this patch, but is there any chance we could factor out the generic bits ... | Andreas Hansson | March 31, 2014, 3:58 a.m. | Open |
Review request changed
Updated (March 31, 2014, 3:59 a.m.)
Posted (March 31, 2014, 3:59 a.m.)
configs/ruby/MESI_Three_Level.py (Diff revision 1)
I realise this is orthogonal to this patch, but is there any chance we could factor out the generic bits here so that it does not have to be repeated for every single coherency protocol? This improvement should ultimately happen before this patch goes in. Just a thought...
Changes should be made to the MI_example protocol as well. Once that is done, I'll commit this patch.
Review request changed
Updated (April 22, 2014, 11:37 a.m.)
Posted (May 2, 2014, 11:19 p.m.)
I tried running the regression tests. They fail because a clock domain
that the patch assumes is missing from the test configurations. Here is
the output from one of the regression tests:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/scratch/nilay/GEM5/gem5/src/python/m5/main.py", line 388, in main
    exec filecode in scope
  File "tests/run.py", line 180, in <module>
    execfile(joinpath(tests_root, 'configs', test_filename + '.py'))
  File "tests/configs/simple-timing-ruby.py", line 82, in <module>
    Ruby.create_system(options, system)
  File "/scratch/nilay/GEM5/gem5/configs/ruby/Ruby.py", line 114, in create_system
    % protocol)
  File "<string>", line 1, in <module>
  File "/scratch/nilay/GEM5/gem5/configs/ruby/MESI_Two_Level.py", line 98, in create_system
    clk_domain=system.cpu_clk_domain,
  File "/scratch/nilay/GEM5/gem5/src/python/m5/SimObject.py", line 736, in __getattr__
    raise AttributeError, err_string
AttributeError: object 'System' has no attribute 'cpu_clk_domain'
  (C++ object is not yet constructed, so wrapped C++ methods are unavailable.)
You would have to update the scripts in tests/configs appropriately.
Review request changed
Updated (Aug. 19, 2014, 1:13 p.m.)
Change Summary:
I had some earlier comments saved as a draft and didn't realize I hadn't sent them, sorry about that. Ruby now takes the clock domains for the L1 controllers and sequencers by directly accessing the "clock_domain" member of the corresponding cpu object. system.cpu seems to be the standard nomenclature for these objects across all the config and test files. The regressions now run correctly, although the tests/configs/rubytest-ruby.py config has been changed to expose the RubyTester as cpus, since that system lacked a system.cpu declaration. Regressions for ALPHA and quick/se with MESI_Two_Level, MOESI_Hammer and MOESI CMP have been run without errors.
Diff: Revision 3 (+17)
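The approach described in this change summary can be sketched with plain Python stand-ins. This is not the actual diff and does not use the m5 API: the objects are `SimpleNamespace` placeholders, `build_l1_controllers` is a hypothetical helper, and the attribute name `clk_domain` is an assumption taken from the keyword shown in the traceback above.

```python
from types import SimpleNamespace


def build_l1_controllers(system):
    """Create one stand-in L1 controller and sequencer per entry in system.cpu.

    Each pair reuses the clock domain of its own cpu instead of reading a
    system-wide attribute such as system.cpu_clk_domain, which the test
    configurations do not define.
    """
    controllers = []
    for i, cpu in enumerate(system.cpu):
        clk = cpu.clk_domain  # assumed attribute name
        sequencer = SimpleNamespace(name=f"seq{i}", clk_domain=clk)
        controllers.append(
            SimpleNamespace(name=f"l1_cntrl{i}", clk_domain=clk,
                            sequencer=sequencer))
    return controllers


# Two cpus with their own clock domain objects (placeholders, not gem5 types).
fast = SimpleNamespace(name="2GHz")
slow = SimpleNamespace(name="1GHz")
system = SimpleNamespace(cpu=[SimpleNamespace(clk_domain=fast),
                              SimpleNamespace(clk_domain=slow)])
l1s = build_l1_controllers(system)

# A tester exposed under system.cpu works the same way, as long as it
# carries a clock domain of its own.
tester = SimpleNamespace(clk_domain=fast)
test_system = SimpleNamespace(cpu=[tester])
test_l1s = build_l1_controllers(test_system)
```

The design choice illustrated here is that the only thing the protocol script needs from the system is a `cpu` list, which is why the tester configuration has to present the tester under that name.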
Posted (Aug. 22, 2014, 8:10 a.m.)
tests/configs/rubytest-ruby.py (Diff revision 3)
I think there is no need for this line of code. Earlier in this file there is a line where a System is created. We should remove tester=tester from that line and put cpu=tester in its place.
Review request changed
Updated (Aug. 23, 2014, 9:32 a.m.)
Change Summary:
System creation changed to put the tester object as the cpu.
Diff: Revision 4 (+15 -1)
