Posted (Feb. 11, 2011, 3:03 p.m.)
There are a number of problems with the code, but without getting into that, I think this is solving the problem from the wrong angle. I actually ran into what I think is the same thing with X86 recently, and it's correct for O3 to mark this as a mispredict in the middle of the macroop. The following microops will see the wrong pc state, although they may not care, and the pc will diverge at the end just as much as it will in the middle. The problem is that O3 will clobber the adjusted npc because the getExtMachInst function for ARM will do what it did for x86 and always adjust npc to point to the end of the instruction it just fetched. That's fine if the pc is out of the blue since npc is just a guess at that point, but if the pc was corrected for a mispredict (or anything else that causes refetching mid macroop) then the new, purposeful npc gets blasted. In x86, I have a patch out that handles that by explicitly tracking the size of the current instruction in the pc. That gets cleared when the PC advances, and if it's not set (ie. a size of 0) then the npc needs to be adjusted since the size was previously unknown. If the size was not 0, then this is a pc where the size was already taken into account and the npc is whatever it is on purpose.
Posted (Feb. 24, 2011, 6:56 a.m.)
This is actually better than my initial impression suggested. I think I saw all the lines from the Ra->URa change and got confused since those didn't have anything to do with RFE. I jumped to the conclusion that you guys had some big, elaborate, round about implementation for this when really you have two changes stuck together. I'm not 100% sure you're safe with out the PC change even if you rearrange things this way. What's going to happen is the NPC is going to be forced to PC + size of the instruction after every instruction is generated, even if that's wrong and is corrected by a branch mispredict. The problem manifests itself functionally when you have a mispredict and don't immediately move the NPC into the PC, preserving the correct control flow. That happened here because you were staying on the same inst to do more microops. I think most of the time if you put the branching microop last you'll get away with it, but I have a creeping doubt that there's some other corner case where it'll get you in trouble. I do agree that the ARM PC state is pretty complicated and it would be nice to slim it down somehow, or at least not make it worse.
-
src/arch/arm/insts/mem.hh (Diff revision 1) -
If these are fixed values you don't need to store them in the class. You can just define operands that always use those particular indexes. That said, it would probably be best to pass in indexes and make them fixed outside of this class. That makes it more obvious what values are being used and makes it easier to allow them to change in the future.
-
src/arch/arm/isa/operands.isa (Diff revision 1) -
There isn't anything inherently wrong with changing these names, but it doesn't belong in this patch. It causes a lot of code to change that doesn't have anything to do with RFE.
-
src/arch/arm/isa/operands.isa (Diff revision 1) -
This one doesn't seem to actually be used.
