cpu, o3: Ignored invalidate causing same-address load reordering
Review Request #2502 - Created Nov. 16, 2014 and submitted
| Information | |
|---|---|
| Andreas Hansson | |
| gem5 | |
| default | |
| Reviewers | |
| Default | |
Changeset 10576:66ef592c85ac --------------------------- cpu, o3: Ignored invalidate causing same-address load reordering In case the memory subsystem sends a combined response with invalidate (e.g. ReadRespWithInvalidate), we cannot ignore the invalidate part of the response. If we were to ignore the invalidate part, under certain circumstances this effectively leads to reordering of loads to the same address which is not permitted under any memory consistency model implemented in gem5. Consider the case where a later load's address is computed before an earlier load in program order, and is therefore sent to the memory subsystem first. At some point the earlier load's address is computed and in doing so correctly marks the later load as a possibleLoadViolation. In the meantime some other node writes and sends invalidations to all other nodes. The invalidation races with the later load's ReadResp, and arrives before ReadResp and is deferred. Upon receipt of the ReadResp, the response is changed to ReadRespWithInvalidate, and sent to the CPU. If we ignore the invalidate part of the packet, we let the later load read the old value of the address. Eventually the earlier load's ReadResp arrives, but with new data. As there was no invalidate snoop (sunk into the ReadRespWithInvalidate), and if we did not process the invalidate of the ReadRespWithInvalidate, we obtain a load reordering. A similar scenario can be constructed where the earlier load's address is computed after ReadRespWithInvalidate arrives for the younger load. In this case hitExternalSnoop needs to be set to true on the ReadRespWithInvalidate, so that upon knowing the address of the earlier load, checkViolations will cause the later load to be squashed. Finally we must account for the case where both loads are sent to the memory subsystem (reordered), a snoop invalidate arrives and correctly sets the later loads fault to ReExec. However, before the CPU processes the fault, the later load's ReadResp arrives and the writeback discards the outstanding fault. We must add a check to ensure that we do not skip any unprocessed faults.
Issue Summary
2
1
1
0
| Description | From | Last Updated | Status |
|---|---|---|---|
| But are you not losing the fault set by completeAcc()? | Nilay Vaish | Nov. 23, 2014, 3:56 a.m. | Open |
Posted (Nov. 23, 2014, 3:56 a.m.)
In the last paragraph for the commit message, you metion "second load". Replace it with "later load" since that is what you use in the rest of the message.
-
src/arch/alpha/locked_mem.hh (Diff revision 1) -
Can we put this and the next hunk in a separate patch? It should be asserted in the commit message and the necessary source files that each ISA which implements these functions needs to compute the snoop address in the following manner.
-
src/cpu/o3/lsq_impl.hh (Diff revision 1) -
But are you not losing the fault set by completeAcc()?
Review request changed
Updated (Nov. 24, 2014, 4:21 a.m.)
Description: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 2 (+37 -3) |
Ship It!
