O3: Enhance data address translation by supporting hardware page table walkers.
Review Request #422 - Created Jan. 12, 2011 and submitted
| Information | |
|---|---|
| Ali Saidi | |
| gem5 | |
| Reviewers | |
| Default | |
| ali, gblack, nate, stever | |
O3: Enhance data address translation by supporting hardware page table walkers. Some ISAs (like ARM) relies on hardware page table walkers. For those ISAs, when a TLB miss occurs, initiateTranslation() can return with NoFault but with the translation unfinished. Instructions experiencing a delayed translation due to a hardware page table walk are deferred until the translation completes and kept into the IQ. In order to keep track of them, the IQ has been augmented with a queue of the outstanding delayed memory instructions. When their translation completes, instructions are re-executed (only their initiateAccess() was already executed; their DTB translation is now skipped). The IEW stage has been modified to support such a 2-pass execution.
Posted (Jan. 16, 2011, 4:28 p.m.)
This change seems to have some dead functionality in it. The "delay" member added to translations is never used, the "delay()" pure virtual function is never used and not defined for any other ISAs (which I think will break them all), and the translationDelayed variable is never used. Unnecessary copyright changes should also be rolled back when removing that dead code. Are there any instructions that actually expect to get a valid fault when performing initiateAcc? I could imagine there might be since it used to be something you could always do for the most part, but getting rid of those instances would help simplify this code I think. The code that prepares a request for the memory system could follow from the finishTranslation function no matter if it happened immediately or after a table walk. Then an instruction would (hopefully) only require one pass since it's work would be done and it would either be ready to commit (a store) or ready for completeAcc (a load). The actual load/store queue would have to wait since the address might not be ready, but that might be a pretty simple extension on top of waiting for initiateAcc to happen.
-
src/sim/tlb.hh (Diff revision 1) -
Did you compile an ISAs other than ARM? I don't see this being defined for any of them, and since it's pure virtual I don't think they'll work any more.
Posted (Jan. 27, 2011, 7:11 a.m.)
Any further comments on this? It doesn't alter the regressions at all, so I would like to commit it, even if we change initiateAcc() to not return a fault in the future. Ali
-
src/cpu/simple/timing.hh (Diff revision 1) -
I'm pretty sure I'm going to use this for my dtlb followed by itlb walk issue in the timing simple cpu.
-
src/cpu/translation.hh (Diff revision 1) -
It's used right here.
-
src/sim/tlb.hh (Diff revision 1) -
All the regressions still pass... This isn't the base tlb object, it's the translation object so yes, it's been added to all the cpu models.
Posted (Jan. 28, 2011, 12:14 a.m.)
-
src/arch/arm/tlb.cc (Diff revision 1) -
I think markDelayed() would be a much better name than delay() for this function, since the function itself doesn't do or cause any delaying.
-
src/cpu/o3/iew_impl.hh (Diff revision 1) -
What's the difference between (Started && !Completed) and Delayed? It seems like there's only a very small window of time where the former doesn't imply the latter; if you're explicitly doing something in that window where the difference matters, it would be good to comment that. If you're not, then we probably don't need three different variables.
-
src/cpu/o3/inst_queue.hh (Diff revision 1) -
typo in comment ("deferred" not "referred")
