@@ -314,8 +314,8 @@ internal FPGA counter with 4 ns resolution. 4) When the generated data is
received again at the FPGA, the counter is stopped. 5) The host program reads
out the counter values and computes the round-trip latency. The distribution of
10000 measurements of the one-way latency is shown in \figref{fig:latency}. The
-GPU latency has a mean value of 168.76 \textmu s and a standard variation of
-12.68 \textmu s. This is 9.73 \% slower than the CPU latency of 153.79 \textmu s
+GPU latency has a mean value of 84.38 \textmu s and a standard deviation of
+6.34 \textmu s. This is 9.73 \% slower than the CPU latency of 76.89 \textmu s
that was measured using the same driver and measuring procedure. The
non-Gaussian distribution with two distinct peaks indicates a systemic influence
that we cannot control and is most likely caused by the non-deterministic