Clock: Rework clocks to avoid tick-to-cycle transformations
Review Request #1321 - Created July 25, 2012 and submitted
| Information | |
|---|---|
| Andreas Hansson | |
| gem5 | |
| default | |
| Reviewers | |
| Default | |
Changeset 9177:458faeb1b6d6 --------------------------- Clock: Rework clocks to avoid tick-to-cycle transformations This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.
util/regress all passing (disregarding t1000 and eio) A minor update. This change did improve performance. Running the full regression, including a clean compile of all the ISAs went down by 8%. Note that this includes the time for building as well.
Issue Summary
| Description | From | Last Updated | Status |
|---|---|---|---|
| This obviously doesn't work if clock is not a constant... something like this would be more general (assuming I got ... | Steve Reinhardt | July 28, 2012, 4:48 a.m. | Open |
| This seems to be an absolute cycle. Do we no longer require that cycles are relative? We ensure that cycles ... | Nathan Binkert | Aug. 23, 2012, 1:35 a.m. | Open |
Description: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Testing Done: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Diff: |
Revision 2 (+138 -86) |
Change Summary:
Added a note on the performance difference.
Testing Done: |
|
|---|
this sort of thing worries me because of the possibilities of introducing subtle bugs, but it looks like a good change long term.
-
src/sim/clocked_object.hh (Diff revision 2) -
This obviously doesn't work if clock is not a constant... something like this would be more general (assuming I got it right): tick += divCeil(curTick() - tick, clock) * clock; of course that's not a problem now, but it might be better to plan for that rather than having this be yet another thing that needs to be fixed later.
-
src/cpu/simple/timing.cc (Diff revision 2) -
It looks like here you've converted a delay in ticks into a delay in cycles.
-
src/cpu/simple/timing.cc (Diff revision 2) -
Same here.
Description: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 3 (+160 -114) |
-
src/sim/clocked_object.hh (Diff revision 2) -
Sorry not to be more clear. I meant that both lines 92 and 93 (the cycle and tick assignments) should be replaced. The line I provided was meant to come first and show how tick would need to be updated based on its current value. I was leaving the corresponding update of cycle as an exercise for the reader ;-). Actually at that point I was probably in my phase where I was hoping that we could eliminate cycle entirely. If we do keep cycle, it would be more like: CycleCount elapsedCycles = divCeil(curTick() - tick, clock); cycle += elapsedCycles; tick += elapsedCycles * clock;
Description: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 4 (+162 -112) |
Overall this looks great. It would be nice if we could figure out how to make things like trapLatency below have their latencies depend on other clocks. That way if you change the frequency of one clock all dependent clocks get updated automatically. This is along the lines of what Joel was e-mailing about.
-
src/arch/x86/mmapped_ipr.hh (Diff revision 4) -
Why is the return value here 1 when the return type is still Tick?
-
src/arch/x86/mmapped_ipr.hh (Diff revision 4) -
ditto
-
src/cpu/inorder/cpu.cc (Diff revision 4) -
This seems to be an absolute cycle. Do we no longer require that cycles are relative? We ensure that cycles are always increasing?
-
src/cpu/o3/O3CPU.py (Diff revision 4) -
Be nice to have a Param type that specifically indicates Cycles, but I understand why not.
-
src/cpu/o3/commit.hh (Diff revision 4) -
Do we allow uint as a type? Seems like we should use unsigned since we don't use uint elsewhere.
-
src/sim/clocked_object.hh (Diff revision 4) -
I'd like these comments to be just a tad more informative. How's this: // The tick value of the next clock edge (>= curTick()) at the time of the last call to update() mutable Tick tick; // The cycle counter value corresponding to the current value of 'tick' mutable Tick cycle;
-
src/sim/clocked_object.hh (Diff revision 4) -
Since tick and clock correspond to the *next* clock edge, I think this '==' could be '>='. It wouldn't change the results at all (I'm pretty sure) but would avoid the divide in more cases. Also, re: your problem with tick += clock being outside the if -- I'd say the real problem is that between tick += clock and cycle += 1, one was outside the if and the other was inside. So there's an alternate solution like this: tick += clock; cycle += 1; if (tick >= curTick()) { return; } The elapsedCycles code doesn't have to change at all, though the value of elapsedCycles would now be smaller by one than the value with the current code. I'm not sure which version is better; I thought of this new one in the process of trying to eliminate the redundant addition of tick to clock that's in the current code, though (1) I'd think a good compiler could eliminate that anyway and (2) it's just an integer add so who cares. Feel free to pick whichever version you prefer.
Description: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 5 (+168 -112) |
So my hypothesis was correct? Nice... thanks for checking! I'll just mention that there's a minor inconsistency in that one of the 'if (tick >= curTick())' checks has braces and the other doesn't. Do what you will with that. No need to repost.
