Andreas Hansson got review request #2332!
cpu: Fix cache blocked load behavior in o3 cpu
Review Request #2332 - Created Aug. 13, 2014 and submitted
| Information | |
|---|---|
| Andreas Hansson | |
| gem5 | |
| default | |
| Reviewers | |
| Default | |
Changeset 10312:1529a4fc718f --------------------------- cpu: Fix cache blocked load behavior in o3 cpu This patch fixes the load blocked/replay mechanism in the o3 cpu. Rather than flushing the entire pipeline, this patch replays loads once the cache becomes unblocked. Additionally, deferred memory instructions (loads which had conflicting stores), when replayed would not respect the number of functional units (only respected issue width). This patch also corrects that. Improvements over 20% have been observed on a microbenchmark designed to exercise this behavior.
Posted (Aug. 16, 2014, 9:01 a.m.)
Two points that I would like to make: * The opening comment in the patch states that it is trying to do two things. I would suggest that we split the patch. * I think we should not drop the original behaviour. Firstly, it was not incorrect. Secondly, no reason has been provided as to why the behaviour implemented should be preferred. Are we sure that most out-of-order processors would choose the proposed over the original?
Posted (Aug. 19, 2014, 10:11 a.m.)
I'll take Mitch and Stephen's word that we need this patch. I'll still like to understand it in more detail before we commit it.
-
src/cpu/o3/inst_queue.hh (Diff revision 1) -
I think we need better differentiation between this list and the one declared after it. On further reading, it seems that we may not need the two lists. Can we just mark the instructions that they should be retried? While adding them back to the ready queue, we can check which ones are marked. Or may be keep an iterator that tracks the point till which we should retry. One more thought, can we do with a queue instead of a list?
-
src/cpu/o3/inst_queue_impl.hh (Diff revision 1) -
Should we not clear retryMemInsts as well?
-
src/cpu/o3/inst_queue_impl.hh (Diff revision 1) -
Let's retain the new line above this while loop.
-
src/cpu/o3/inst_queue_impl.hh (Diff revision 1) -
We should use nullptr now that we have gcc minimum dependency at 4.6.
-
src/cpu/o3/lsq_unit.hh (Diff revision 1) -
Do we need all these changes that appear over next 15-20 lines? It seems from my initial reading that the previous code structure could have been retained.
-
src/cpu/o3/lsq_unit_impl.hh (Diff revision 1) -
New line after.
Review request changed
Updated (Aug. 27, 2014, 9:13 a.m.)
Summary: |
|
||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Description: |
|
||||||||||||||||||||||||||||||||||||||||||||||||
Diff: |
Revision 2 (+203 -256) |
