Joel Hestness got review request #3773!

Information
Submitter:	Joel Hestness
Repository:	gem5
Branch:	default
Bugs:
Depends On:
Reviewers
Groups:	Default
People:

Description

Changeset 11802:ca5c5b982ea5
---------------------------
ruby: PerfectSwitch add assured access arbitration

When operating near bandwidth saturation and using finite cache hierarchy
buffering, the round-robin arbitration in the PerfectSwitch caused low ID
input buffers to gain access to the switch more frequently than other input
buffers that might contain requests. This resulted from the priority cycling
starting on input buffers with no pending requests and cycling around to the
low ID buffers with pending requests. Part of the problem was that
input-to-output port allocation was done on-the-fly while cycling through
input ports.
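
The bias described above can be reproduced with a small model. The sketch below is not the gem5 code: `countGrantsRotatingStart` is a hypothetical helper that assumes one grant per cycle, a starting priority that advances by one input every cycle, and a `pending` vector that stays fixed (saturated inputs never drain).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not PerfectSwitch code: one grant per cycle, with the
// starting priority advancing by one input each cycle regardless of which
// inputs actually have pending requests.
std::vector<int> countGrantsRotatingStart(const std::vector<bool>& pending,
                                          int cycles)
{
    const std::size_t n = pending.size();
    assert(n > 0);
    std::vector<int> grants(n, 0);
    std::size_t start = 0;
    for (int c = 0; c < cycles; ++c) {
        // Scan from the current starting input and grant the first input
        // that has a pending request.
        for (std::size_t k = 0; k < n; ++k) {
            const std::size_t in = (start + k) % n;
            if (pending[in]) {
                ++grants[in];
                break;
            }
        }
        // Priority moves on even if 'start' had nothing to send.
        start = (start + 1) % n;
    }
    return grants;
}
```

With eight inputs where only inputs 0 and 1 have pending requests, 800 cycles of this model give input 0 a total of 700 grants and input 1 only 100, because every starting position from 2 through 7 wraps around to input 0 first.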

To fix this, refactor the PerfectSwitch to remove on-the-fly arbitration, and
better delineate port allocation from switch traversal. Then, implement
cycling-priority assured access arbitration using output port request batches
to ensure that all input ports are given the same priority when buffers are
full.
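
One way to picture request batches for a single output port is sketched below. This is an illustrative assumption rather than the patch's implementation: `countGrantsBatched` is a hypothetical helper, `pending` is held fixed to model saturated inputs, and the batch is served in ID order, whereas the patch cycles priority.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not PerfectSwitch code: requests for one output port
// are served in batches. A batch is a snapshot of the inputs that had a
// pending request when the previous batch finished, and every member of the
// batch is granted once before a new batch is formed.
std::vector<int> countGrantsBatched(const std::vector<bool>& pending,
                                    int cycles)
{
    const std::size_t n = pending.size();
    assert(n > 0);
    std::vector<int> grants(n, 0);
    std::vector<bool> in_batch(n, false);
    std::size_t remaining = 0;
    for (int c = 0; c < cycles; ++c) {
        if (remaining == 0) {
            // Form a new batch from the inputs that are pending right now.
            for (std::size_t i = 0; i < n; ++i) {
                in_batch[i] = pending[i];
                if (pending[i]) ++remaining;
            }
            if (remaining == 0) continue;  // nothing to arbitrate this cycle
        }
        // Grant one member of the current batch and retire it from the batch.
        for (std::size_t i = 0; i < n; ++i) {
            if (in_batch[i]) {
                ++grants[i];
                in_batch[i] = false;
                --remaining;
                break;
            }
        }
    }
    return grants;
}
```

With eight inputs where only inputs 0 and 1 are pending, 800 cycles of this model give each of them 400 grants, since no input can be granted twice within one batch.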

This fix reduces GPU core progress asymmetry from >3x down to <12%, which is
in line with hardware.

Testing Done

Extensive testing and use in gem5-gpu. Used the GPU to saturate cache
hierarchy bandwidth, and tracked threadblock progress to observe the
asymmetry. Repeated this testing after the fix and saw greatly reduced
asymmetry. In these small tests, simulator run time also improves slightly
because PerfectSwitch arbitration performs less work. Have also run thousands
of simulations with this patch to verify that the changes work for a wide
range of simulated system behaviors.

Issue Summary

Description	From	Last Updated	Status
In your comment, please explain why this is a three dimensional vector, rather than just a two dimensional one vnet ...	Brad Beckmann	Jan. 24, 2017, 10:52 p.m.	Resolved
Is it possible to pull this loop into a separate function? This is quite a complicated, long while loop. It ...	Brad Beckmann	Jan. 24, 2017, 10:52 p.m.	Resolved

Description:

~		Changeset 11786:c9937aad9a53
	~	Changeset 11786:93f0e3b78f2d

Diff:

Revision 2 (+318 -160)

	src/mem/ruby/network/simple/PerfectSwitch.hh
	src/mem/ruby/network/simple/PerfectSwitch.cc

Overall this patch looks really good. I'm sure it helps out GPU simulations quite a bit. I do have a few questions/comments I would like answered/addressed before I give it a ship it.

Joel Hestness Jan. 24, 2017, 10:53 p.m.
```
Agreed on your suggestions. I've updated the patch.
```

src/mem/ruby/network/simple/PerfectSwitch.hh (Diff revision 2)

In your comment, please explain why this is a three-dimensional vector rather than just a two-dimensional one (vnet x input port). Based on the current comment, I would have thought you only had to maintain this bit vector for each vnet's input port, rather than for each vnet input/output combination.

src/mem/ruby/network/simple/PerfectSwitch.cc (Diff revision 2)

Minor question, but wouldn't a 'return' be more appropriate than a 'break'?

src/mem/ruby/network/simple/PerfectSwitch.cc (Diff revision 2)

Is it possible to pull this loop into a separate function? This is quite a complicated, long while loop. It would be nice to break it up and make it more readable.

Description:

~		Changeset 11786:93f0e3b78f2d
	~	Changeset 11802:ca5c5b982ea5

Diff:

Revision 3 (+321 -149)

	src/mem/ruby/network/simple/PerfectSwitch.hh
	src/mem/ruby/network/simple/PerfectSwitch.cc

Ship It!