gpu: AMD's baseline GPU model
Review Request #3189 - Created Oct. 30, 2015 and submitted
| Information | |
|---|---|
| Tony Gutierrez | |
| gem5 | |
| default | |
| Reviewers | |
| Default | |
Changeset 11282:118397102cfe --------------------------- gpu: AMD's baseline GPU model This patch includes the entire baseline GPU model from AMD.
Impressive!
Should it not be src/dev/gpu rather than src/gpu?
Also, could you please make it a subdirectory in the gpu dir (as from above). There are more GPUs in this world...in fact there is already the NoMali gpu in src/dev/arm (we could move that if needed). If it's x86 specific, then perhaps src/dev/x86/gpu?
This is indeed very impressive! However, as it is, this patch is basically the definition of unreviewable. Could you split it into a handful of different patches for review purposes? ReviewBoard makes all kinds of disconcerting creaking noises when I try to browse the massive (almost 2MiB!) patch. Could you split it into a set of different patches, at least for review purposes?
I'd suggest something along these lines:
- Memory system stuff (Ruby)
- GPU memory components (TLB/page table walkers etc.)
- Configuration
- Main build system (i.e., not the scons files in gpu/)
- Pipeline
- Object loader(s)
- ISAQuickly eyeing the patch, I noticed two things:
- The new GPU ISA constant lives in the same header as the CPU ISA. Please put it in a separate header. This will make things easier down the line if we ever decide to reduce the set of files we need to recompile for different ISAs.
- Some classes in gpu/ live in the X86ISA name space. This seems wrong.
Looking forward to seeing the GPU in action! I'm mostly fine with new files being added as-is, assuming they pass the style checker.
Below are notes mostly regarding the changes/additions in existing gem5 code.
-
src/gpu/gpu_mem_packet.cc (Diff revision 1) -
Also noted in the GPU atomics patch: Why not just return a pointer to the type-specific functions rather than dynamically instantiating wrappers around them for each atomic Request?
-
src/mem/protocol/RubySlicc_Exports.sm (Diff revision 1) -
Please describe these more clearly. What is a CorePair? What do "TCP", "TCC", "SQC" stand for? Etc.
*Note: By adding machine types to this enumeration, you are declaring them as globally visible across any protocol that cares to use them. If we allow ambiguous or inaccurate descriptions here, various protocols may use them in different ways that make the intended naming unclear. If too ambiguous, users will resort to adding more names, bloating this enumeration, and causing this file to always have merge conflicts.
-
src/mem/protocol/RubySlicc_Types.sm (Diff revision 1) -
Does declaring this here cause conflicts with the declarations in MOESI_CMP_directory-L2cache.sm and MOESI_CMP_token-L2cache.sm? Even if SLICC is somehow able to manage multiple definitions, it would be good to avoid the redundant declarations. Maybe we could move this change out of this patch to a separate patch that also removes the old declarations?
-
src/mem/ruby/slicc_interface/RubySlicc_ComponentMapping.hh (Diff revision 1) -
This function is the same as map_Address_to_DirectoryNode(). Why replicate it?
-
src/mem/ruby/slicc_interface/RubySlicc_ComponentMapping.hh (Diff revision 1) -
This function is functionally identical to createMachineID. Why add a separate function?
-
src/mem/ruby/structures/CacheMemory.hh (Diff revision 1) -
I don't see where this function is used in any of these patches. Can you please verify that it is necessary?
-
src/mem/ruby/structures/CacheMemory.hh (Diff revision 1) -
I don't see where this function is used in any of these patches. Can you please verify that it is necessary?
-
src/mem/ruby/structures/CacheMemory.hh (Diff revision 1) -
Couple things:
1) I see that initializing variables like this in the class declaration is new C++11 functionality. This kind of initialization is inconsistent with all other class member declarations in gem5, though, and this particular initialization doesn't appear to be necessary (see below).2) The m_block_size variable only appears to be used in CacheMemory::init(). Do we need to add it, or can it just be a local variable in init()?
-
src/mem/ruby/structures/CacheMemory.cc (Diff revision 1) -
This looks like it could result in bugs. m_block_size is initialized to -1 in CacheMemory.hh, and possibly to sizes other than RubySystem::getBlockSizeBytes() in the constructor. Seems like it could either be incorrectly set and not 0 at the time init() is called (e.g. 2), or the initialization to -1 is unnecessary.
1) Why not require that it be set when a CacheMemory is instantiated in Python?
2) or, why not remove the initialization in the class declaration? -
src/mem/ruby/system/GPUCoalescer.hh (Diff revision 1) -
This guy looks an awful lot like a Sequencer, so there is a lot of replicated code between here and the Sequencer. It seems like better abstraction needs to be used. Could GPUCoalescer descend from Sequencer instead, and just overload/add functionality that is different?
I've already noted ramifications this has in the profiler patch (3192). I'm not sure if it's appropriate to address this before checking in the GPU, but if not, it should certainly be addressed shortly after. I'd prefer a plan of attack.
-
src/mem/ruby/system/RubyPort.hh (Diff revision 1) -
Just a note: This is basically the accessor function I am referring to when I suggest that the SLICC generated getCPUSequencer and getGPUCoalescer functions should be merged to return a type from which both Sequencer and GPUCoalescer inherit.
-
src/mem/ruby/system/Sequencer.py (Diff revision 1) -
Why is this default initialized to 16?
-
src/mem/protocol/RubySlicc_Types.sm (Diff revision 1) -
These don't appear to be defined or used anywhere. Can you please remove them?
Description: |
|
||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 2 (+56498 -88) |
Description: |
|
|||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Diff: |
Revision 3 (+56559 -88) |
...and yes, please split the patch to make it reviewable (if that is a word).
-
configs/ruby/MemCntrlConstructor.py (Diff revision 3) -
I thought we concluded that one fixed latency memory controller is enough? I'd suggest to use SimpleMemory
-
SConstruct (Diff revision 3) -
This makes it look like the two are not orthogonal, could you explain/comment?
-
configs/common/GPUTLBConfig.py (Diff revision 3) -
the location of the file, and the name of the functions make it seem quite generic, when in fact it is very specific and makes quite some assumptions
would it be worth putting this is a specific dir and making the names prefixed with apu/compute-gpu/gpgpu or similar?
-
configs/common/GPUTLBConfig.py (Diff revision 3) -
the division of functionality is very hard to figure out. there is a top level somewhere that assembles some parts of the system? could you explain the design here, and how this fits together with the rest of the config system?
-
configs/common/GPUTLBOptions.py (Diff revision 3) -
I think we need some form of prefix, apu/compute-gpu/gpgpu etc.
Most, if not all, existing options do not use camelCase, but instead small caps and dashes.
This is also really exposing the issues we have with the current config system...but that's a different story.
-
configs/common/GPUTLBOptions.py (Diff revision 3) -
Quite a few questions around units here.
-
configs/example/apu_se.py (Diff revision 3) -
the naming seems quite inconsistent with a mix of camelCase and dash-separated-names
-
configs/example/apu_se.py (Diff revision 3) -
any reason why minor is not here?
Why not use CpuConfig?
-
configs/example/apu_se.py (Diff revision 3) -
CPU vs command processor? What is the difference?
-
configs/example/apu_se.py (Diff revision 3) -
What is this based on?
-
configs/example/ruby_gpu_random_test.py (Diff revision 3) -
besides the protocol it is not clear why the normal ruby test script is not used, there seems to be a lot of overlap
-
configs/ruby/GPU_RfO.py (Diff revision 3) -
there seems to be an awful lot of repetition of boilerplate code involved in these scripts, can the common parts not be broken out?
-
configs/ruby/GPU_VIPER_Baseline.py (Diff revision 3) -
again...there is a lot of repetition
these files really need some modularity
Initial pass through the first page
So
-
src/mem/ruby/structures/CacheMemory.cc (Diff revision 3) -
This seems like a piece of functionality that is independent from the compute-gpu. A patch on its own?
-
src/mem/ruby/system/GPUCoalescer.cc (Diff revision 3) -
seems like an enum would be a better option for the request scope
-
src/mem/ruby/system/GPUCoalescer.cc (Diff revision 3) -
same as above but for the segment
-
src/mem/ruby/system/Sequencer.cc (Diff revision 3) -
context ID? What is this core ID, and what is it for?
-
src/mem/ruby/system/Sequencer.py (Diff revision 3) -
Is this GPU related?
Some more.
