Review Board 2.0.15


Ruby: Fixes MESI CMP directory protocol

Review Request #420 - Created Jan. 10, 2011 and submitted

Information
Nilay Vaish
gem5
default
Reviewers
Default
ali, gblack, nate, stever
Ruby: Fixes MESI CMP directory protocol
The current implementation of MESI CMP directory protocol is broken.
This patch, from Arkaprava Basu, fixes the protocol.

   
Posted (Jan. 11, 2011, 2:10 a.m.)
After you delete the commented out lines, please check it in.  I was leery of silently dropping those L1_PUTXes and I'm glad to see the L2 no longer does that.
Please delete these commented out lines
Posted (Jan. 12, 2011, 9:04 a.m.)



  
  1. I don't know why the comment went missing. I'll post the question again.
    I think the states MT_SB and MT_IB are not required. In fact, I am not 
    sure why unblock messages have to be sent out.
  2. I'm confused.  I just asked for the commented out transition to be deleted.  I'm not sure what comment you're referring to.
    
    When you say the unblock messages, I assume you are referring to the WB_acks, right?  I believe the acks need to be sent out because you want to block the L1 until it receives the acks.  Otherwise, sinking random writeback acks can be confusing and lead to several bugs, which currently this protocol definitely has.  There may be further optimizations we can make, such as removing the MT_SM and MT_IB states, as well as possibly combining L1_PUTX and L1_PUTX_old events.  However, I suggest making those optimizations in a separate patch.  In my opinion, right now the number one priority is to fix this protocol as soon as possible.  Otherwise when checked in, my series of patches will expose several bugs in the protocol and thus break the regression tester.
  3. I am confused too. I think Nilay is talking about "UNBLOCK" messages that are sent when there is exclusive owner change for a cache block or a cache block is doing a M->S transition. In general, directory cache coherence protocols uses blocking states (in this protocol the states whose name ends with "B") to constrain amount of races as blocking helps in serializing requests. In theory you can get rid of UNBLOCKS through more transient states and/or through NACKs. 
    For this particular case that Nilay mentioned here, the MT_SB and MT_IB states are used to make sure modified data is not lost when there is coherence permission down-gradation at the exclusive owner's cache due to request from other cores. 
    I also totally agree with Brad that the purpose of this patch is to fix a broken protocol not to do optimization. And unlike MOESI_CMP_directory, this protocol is targeted to work as simple baseline protocol than highly optimized one.
    
    Thanks
    Arka     
  4. Arka's right, I was talking about those very 'UNBLOCK' messages. I
    agree this discussion is unrelated to the changes necessary for fixing
    the protocol. When I fixed the protocol, I was trying to reason why
    those states might have been added. I think the state MT_IIB makes 
    sure that the L2 cache receives the modified data from the L1 cache. Once the 
    data has been received, the L2 cache state should directly move to SS.
Posted (Jan. 13, 2011, 12:29 p.m.)
I am going to commit this patch soon. Along with the patch, I am also
thinking of updating the reference outputs for the four regression tests
done on MESI_CMP_directory. Currently, 60.rubytest fails on running the tests.