O3CPU: Revive cachePorts per-cycle dcache access limit
Review Request #1872 - Created May 16, 2013 and updated
| Information | |
|---|---|
| Erik Tomusk | |
| gem5 | |
| default | |
| Reviewers | |
| Default | |
Changeset 9722:7026fe0f45b4
---------------------------
O3CPU: Revive cachePorts per-cycle dcache access limit

This is a stop-gap patch to place a limit on the number of dcache requests the LSQUnit sends each cycle. Currently, the LSQUnit will send any number of requests, leading to unrealistic dcache usage. Note that there is an LSQUnit for each hardware thread, so the cachePorts limit is enforced on a per-thread basis.

What this patch does NOT do:
* Limit icache accesses
* Limit dcache accesses from sources other than the LSQUnit (e.g. accesses from L2)

I'd like to refactor the second half of LSQUnit<Impl>::read(), as it's very messy. It would be helpful to get feedback on whether what it does is functionally correct before I do. It would also be helpful if someone who understands split memory accesses could check if that bit of code is correct, since I don't know how to test it.
When cachePorts is set to 200 (the old value), this patch passes ARM/tests/fast/long with the exception that the regression complains about the new statistic.
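For readers following the discussion, the per-cycle limit described above can be pictured with a small standalone model. This is a minimal sketch and not the code from the patch: the class `PortLimiter`, its methods, and the `portLimitStalls` counter are hypothetical names, and only `cachePorts` comes from the description. It assumes the used-port count is reset at the start of every cycle and that there is one such object per hardware thread.

```cpp
// Hypothetical stand-in for the per-thread bookkeeping; not gem5's LSQUnit.
class PortLimiter
{
  public:
    explicit PortLimiter(unsigned cache_ports)
        : cachePorts(cache_ports), usedPorts(0), portLimitStalls(0)
    {}

    // Assumed to be called once at the start of every simulated cycle.
    void startCycle() { usedPorts = 0; }

    // Try to claim one port for a dcache request in the current cycle.
    // Returns false (and counts a stall) once the budget is exhausted.
    bool trySend()
    {
        if (usedPorts == cachePorts) {
            ++portLimitStalls;
            return false;
        }
        ++usedPorts;
        return true;
    }

    unsigned used() const { return usedPorts; }
    unsigned long stalls() const { return portLimitStalls; }

  private:
    const unsigned cachePorts;      // per-thread limit, fixed at construction
    unsigned usedPorts;             // ports claimed so far this cycle
    unsigned long portLimitStalls;  // illustrative statistic only
};
```

In this model, with a limit as large as 200 the `trySend()` check would rarely fail for any realistic number of requests per cycle.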
Issue Summary
7
6
1
0
| Description | From | Last Updated | Status |
|---|---|---|---|
| Data ports? | Andreas Hansson | June 6, 2013, 3:10 a.m. | Open |
| const unsigned int? | Andreas Hansson | June 6, 2013, 3:10 a.m. | Open |
| == rather than >=? | Andreas Hansson | June 6, 2013, 3:10 a.m. | Open |
| When is this ever decremented? How do we link a decrement of this with getting things going again (or do ... | Andreas Hansson | June 6, 2013, 3:10 a.m. | Open |
| \n at the end? | Andreas Hansson | June 6, 2013, 3:10 a.m. | Open |
| Is the port not also "used" on a blocked request that is waiting for a retry? | Andreas Hansson | June 6, 2013, 3:10 a.m. | Open |
Posted (May 23, 2013, 1:10 p.m.)
Thanks for this, Erik.

My only issue with the way this is handled is that if the CPU runs out of cache ports, I'm pretty sure it will squash the entire pipeline and start re-fetching, which doesn't seem right to me. Anyone else?

Thanks,
Ali
-
src/cpu/o3/lsq_unit_impl.hh (Diff revision 1) -
Can this just be made unsigned so that <0 isn't possible?
Review request changed
Updated (June 4, 2013, 10:48 p.m.)
Description:
Diff: Revision 2 (+52 -19)
Review request changed
Updated (June 4, 2013, 11:13 p.m.)
Description:
Posted (June 6, 2013, 2:49 a.m.)
-
src/cpu/o3/lsq_unit.hh (Diff revision 2) -
This implies that both halves of an unaligned access have to go on the same cycle... is my interpretation accurate, and if so, does this restriction make sense? And if it does, why don't we check up front that we have two free ports before sending the first half?

Offhand, I don't see any reason for this restriction. I can't find anything in the official docs, but some googling indicates that unaligned memory accesses are not guaranteed to be atomic, and sending both halves in the same cycle doesn't guarantee atomicity anyway (maybe if both halves are in the same cache line, but definitely not otherwise).
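The two policies contrasted in this comment can be sketched side by side. This is a hedged illustration only: `SplitAwareLimiter` and its method names are hypothetical and do not appear in the patch. It assumes a split access is two requests, each needing one port.

```cpp
#include <algorithm>

// Hypothetical model of two ways to account for a split (two-half) access.
class SplitAwareLimiter
{
  public:
    explicit SplitAwareLimiter(unsigned cache_ports)
        : cachePorts(cache_ports), usedPorts(0)
    {}

    void startCycle() { usedPorts = 0; }

    unsigned freePorts() const { return cachePorts - usedPorts; }

    // Policy A: both halves must go in the same cycle, so check for
    // two free ports up front and send nothing if they are not available.
    bool trySendSplitTogether()
    {
        if (freePorts() < 2)
            return false;
        usedPorts += 2;
        return true;
    }

    // Policy B: halves may go in different cycles. Sends as many of the
    // pending halves as there are free ports and returns how many went out.
    unsigned trySendSplitIndependent(unsigned halves_pending)
    {
        const unsigned sent = std::min(halves_pending, freePorts());
        usedPorts += sent;
        return sent;
    }

  private:
    const unsigned cachePorts;
    unsigned usedPorts;
};
```

Under Policy A the first half is never sent without a port for the second, which avoids a half-issued access. Policy B makes better use of a single leftover port at the cost of tracking which half is still pending.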
Posted (June 6, 2013, 3:10 a.m.)
Just a random thought. This patch is a good step in the right direction, but why don't we simply use a vector master port for the D side and cycle through them round robin (or pick whichever is free)?
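The round-robin idea could look roughly like the following standalone sketch. It does not use gem5's port classes: `RoundRobinPorts` is a hypothetical name, each port is reduced to a busy flag, and ports are assumed to become free again at the start of every cycle.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a vector of D-side ports picked in round-robin order.
class RoundRobinPorts
{
  public:
    explicit RoundRobinPorts(std::size_t num_ports)
        : busy(num_ports, false), next(0)
    {}

    // Assumption: every port is free again at the start of each cycle.
    void startCycle() { busy.assign(busy.size(), false); }

    // Claim the first free port, starting the search at the port after the
    // one claimed last. Returns the port index, or -1 if all are busy.
    int claim()
    {
        const std::size_t n = busy.size();
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t idx = (next + i) % n;
            if (!busy[idx]) {
                busy[idx] = true;
                next = (idx + 1) % n;
                return static_cast<int>(idx);
            }
        }
        return -1;
    }

  private:
    std::vector<bool> busy;  // one flag per port
    std::size_t next;        // where the next search starts
};
```

Compared with a plain counter, this keeps track of which port carried which request, which matters if each port can block and be retried independently.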
-
src/cpu/o3/O3CPU.py (Diff revision 2) -
Data ports?
-
src/cpu/o3/lsq_unit.hh (Diff revision 2) -
const unsigned int?
-
src/cpu/o3/lsq_unit.hh (Diff revision 2) -
== rather than >=?
-
src/cpu/o3/lsq_unit.hh (Diff revision 2) -
When is this ever decremented? How do we link a decrement of this with getting things going again (or do we simply keep on trying and fail until the number is reduced)?
-
src/cpu/o3/lsq_unit_impl.hh (Diff revision 2) -
\n at the end?
-
src/cpu/o3/lsq_unit_impl.hh (Diff revision 2) -
Is the port not also "used" on a blocked request that is waiting for a retry?
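One way to make this question concrete is a model in which charging a port for a blocked request is a switchable policy. Everything here is hypothetical (`BlockingAwareLimiter`, `SendResult`, the `charge_blocked` flag) and the cache's accept/reject decision is passed in as a plain boolean; it is a sketch of the accounting choice, not of the patch's behaviour.

```cpp
// Outcome of one attempt to send a request to the dcache.
enum class SendResult { Sent, Blocked, NoPort };

// Hypothetical model: should a request rejected by the cache still
// count against the per-cycle port budget?
class BlockingAwareLimiter
{
  public:
    BlockingAwareLimiter(unsigned cache_ports, bool charge_blocked)
        : cachePorts(cache_ports), chargeBlocked(charge_blocked), usedPorts(0)
    {}

    void startCycle() { usedPorts = 0; }

    // cache_accepts stands in for the cache accepting or rejecting the request.
    SendResult send(bool cache_accepts)
    {
        if (usedPorts == cachePorts)
            return SendResult::NoPort;
        if (cache_accepts) {
            ++usedPorts;
            return SendResult::Sent;
        }
        if (chargeBlocked)
            ++usedPorts;  // the port was occupied by the failed attempt
        return SendResult::Blocked;
    }

    unsigned used() const { return usedPorts; }

  private:
    const unsigned cachePorts;
    const bool chargeBlocked;
    unsigned usedPorts;
};
```

With `charge_blocked` set, a blocked attempt uses up the port for that cycle, so a single-port configuration cannot send anything else until the next cycle.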
