printf("Receiving weights using ticks %f\n",timeInterval);
In the test, we perform 131072 data transfers from PL to PS, 8 KB each, so the total size is 8 KB × 131072 = 1 GB. Note that cache flush and invalidate operations are not taken into account here. The result is shown as follows.
Start: 0
End: 144113621
Receiving weights using ticks 1.441280
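Based on the above output, the transfer took 1.441280 seconds (the printed label says ticks, but the value is the elapsed time in seconds), so the bandwidth is 1 GB / 1.441280 s = 710.48 MB/s, with 1 GB = 1024 MB.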
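The arithmetic above can be sketched as a small C helper. This is only an illustration: the function name `bandwidth_mib_per_s` and its parameters are hypothetical and not part of the SDK code, and it assumes 1 MB = 1024 × 1024 bytes, which is what the figures in this post imply.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the original SDK code).
 * Returns the throughput in MB/s (1 MB = 1024 * 1024 bytes) for
 * `transfers` transfers of `bytes_per_transfer` bytes each that
 * completed in `seconds` seconds. */
double bandwidth_mib_per_s(uint64_t transfers,
                           uint64_t bytes_per_transfer,
                           double seconds)
{
    assert(seconds > 0.0); /* guard against division by zero */

    double total_mib =
        (double)(transfers * bytes_per_transfer) / (1024.0 * 1024.0);
    return total_mib / seconds;
}
```

Calling `bandwidth_mib_per_s(131072, 8192, 1.441280)` reproduces the PL to PS figure computed above.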
###Bandwidth from PS to PL

Next, we test the bandwidth from PS to PL. The SDK code is shown as follows.
printf("Receiving weights using ticks %f\n",timeInterval);
The result is shown as follows.
Start: 0
End: 106654906
Receiving weights using ticks 1.066656
The bandwidth is therefore 1 GB / 1.066656 s = 960.01 MB/s. Note that if we add a cache flush inside the loop, the elapsed time increases to 1.777953 seconds, and the bandwidth drops to 1 GB / 1.777953 s = 575.94 MB/s.
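To put the cache flush cost in per-transfer terms, the two elapsed times can be compared directly. The sketch below is an illustration only: `flush_overhead_us` is a hypothetical name, and it assumes the whole time difference is caused by the flush and is spread evenly over the 131072 transfers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the original SDK code).
 * Returns the average extra time per transfer, in microseconds,
 * when the loop with a cache flush takes `seconds_with_flush` and
 * the loop without it takes `seconds_without_flush`. */
double flush_overhead_us(double seconds_with_flush,
                         double seconds_without_flush,
                         uint64_t transfers)
{
    assert(transfers > 0); /* guard against division by zero */

    double extra_seconds = seconds_with_flush - seconds_without_flush;
    return extra_seconds / (double)transfers * 1e6;
}
```

With the measurements above, `flush_overhead_us(1.777953, 1.066656, 131072)` gives roughly 5.4 microseconds of added time per 8 KB transfer.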