ARM: Mark some variables uncacheable until boot all CPUs are enabled.

Review Request #824 - Created Aug. 9, 2011 and submitted

Information
Submitter: Ali Saidi
Repository: gem5
Reviewers
Groups: Default
People: ali, gblack, nate, stever
ARM: Mark some variables uncacheable until boot all CPUs are enabled.

There is a set of locations in the Linux kernel that are managed via
cache maintenance instructions until all processors enable their MMUs & TLBs.
Writes to these locations are manually flushed from the cache to main
memory when they occur so that cores operating without their MMU enabled
and only issuing uncached accesses can receive the correct data. Unfortunately,
gem5 doesn't support any kind of software-directed maintenance of the cache.
Until such time as that support exists, this patch marks the specific cache blocks
that need to be coherent as non-cacheable until all CPUs enable their MMUs, and
thus allows gem5 to boot MP systems with caches enabled (a requirement for
booting an O3 CPU and thus for an O3 MP CPU regression).
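
As a rough illustration of the workaround described above, the check could look something like the following sketch. This is not the code in the patch: the class `BootCoherenceFilter`, the `AddrRange` struct, and every method name are hypothetical, and the sketch assumes the affected kernel locations are known up front as a list of address ranges.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is an illustration, not gem5's TLB code.
using Addr = std::uint64_t;

struct AddrRange {
    Addr start;  // first byte covered by the range
    Addr size;   // length of the range in bytes

    bool contains(Addr a) const { return a >= start && a - start < size; }
};

class BootCoherenceFilter {
  public:
    BootCoherenceFilter(unsigned numCpus, std::vector<AddrRange> ranges)
        : mmuEnabled(numCpus, false), watched(std::move(ranges))
    {
        assert(numCpus > 0);
    }

    // Record that the given CPU has turned its MMU on.
    void mmuOn(unsigned cpu) { mmuEnabled.at(cpu) = true; }

    // True once every CPU has its MMU enabled.
    bool allMmusOn() const
    {
        for (bool on : mmuEnabled) {
            if (!on)
                return false;
        }
        return true;
    }

    // True if an access to 'a' should bypass the caches: the address is in
    // a watched range and at least one CPU still runs with its MMU off.
    bool forceUncacheable(Addr a) const
    {
        if (allMmusOn())
            return false;
        for (const AddrRange &r : watched) {
            if (r.contains(a))
                return true;
        }
        return false;
    }

  private:
    std::vector<bool> mmuEnabled;    // one flag per CPU
    std::vector<AddrRange> watched;  // blocks that must stay coherent
};
```

Once the last CPU reports that its MMU is on, `forceUncacheable()` returns false for every address and the watched blocks are cached normally again.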

   
Posted (Aug. 10, 2011, 4:49 a.m.)
I don't object to this patch.  I'm just curious how hard it would be to get the flushing operations to work.  Shouldn't that be relatively straightforward?  I mean we do support eviction, and we have mechanisms for tracking them.  Perhaps it's just not worth it though.
  1. I don't think a single level would be that bad although it can be tricky. 
    
    The simplest method would be to send a cache maintenance packet (invalidate or flush) to the lower-level cache and hold off completing the instruction until the packet returns, at which point the maintenance is done. With a single level of private cache, this would do the trick most of the time. However, at least with ARM, if the invalidation covers a rather large area, the completion can be asynchronous, which makes it more complicated. Additionally, there is an issue when more than one level of potentially private caches comes into play. The packet then needs to be replicated and propagated down to the lower levels, and all the responses need to be collected before the operation is complete.
    
    In summary, a one-off fix that makes this particular problem go away using cache flushes/invalidates can be done, but it's not a generic implementation. The latter would be much harder, and I'm not sure that the partial implementation is any better than this change.
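
To make the multi-level case in the reply above a bit more concrete, here is a minimal sketch of the "replicate, then collect all responses" bookkeeping. It is only an illustration under stated assumptions: `MaintenanceTracker` and its methods are hypothetical names, not part of gem5's packet or port API, and the sketch assumes the number of caches that receive a copy of the request is known when the request is issued.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical helper: counts outstanding responses for one maintenance
// operation that was replicated to several caches.
class MaintenanceTracker {
  public:
    // numCaches: how many caches were sent a copy of the request.
    // onDone:    invoked once, when the last response has arrived.
    MaintenanceTracker(std::size_t numCaches, std::function<void()> onDone)
        : pending(numCaches), done(std::move(onDone))
    {
        assert(numCaches > 0);
    }

    // Called once per cache after it has handled its copy of the request.
    void recvResponse()
    {
        assert(pending > 0);
        if (--pending == 0 && done)
            done();  // the instruction may now complete
    }

    // True once every cache has responded.
    bool complete() const { return pending == 0; }

  private:
    std::size_t pending;         // responses still outstanding
    std::function<void()> done;  // completion callback
};
```

The instruction that issued the maintenance operation would stay blocked until `complete()` is true. The asynchronous completion mentioned for large invalidations is not modeled here.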