8 years ago · 54925c291e
--- a/paper.tex
+++ b/paper.tex
@@ -116,12 +116,14 @@ developed a high-performance DMA engine based on Xilinx's PCIe Gen3 Core.To
 process the data, we encapsulated the DMA setup and memory mapping in a plugin
 for our scalable GPU processing framework~\cite{vogelgesang2012ufo}. This
 framework allows for an easy construction of streamed data processing on
-heterogeneous multi-GPU systems. The framework is based on OpenCL,  and
+heterogeneous multi-GPU systems. Because the framework is based on OpenCL,
 integration with NVIDIA's CUDA functions for GPUDirect technology is not
-possible. We therefore integrated direct FPGA-to-GPU communication into our
-processing pipeline using AMD's DirectGMA technology. In this paper we report
-the performance of our DMA engine for FPGA-to-CPU communication and some
-preliminary measurements about DirectGMA's performance in low-latency applications.
+possible at the moment. Thus, we used AMD's DirectGMA technology to integrate
+direct FPGA-to-GPU communication into our processing pipeline. In this paper we
+report the performance of our DMA engine for FPGA-to-CPU communication and some
+preliminary measurements about DirectGMA's performance in low-latency
+applications.
+
 \section{Architecture}
@@ -143,7 +145,7 @@ they are not directly involved in the data transfer anymore.
     In a traditional DMA architecture (a), data are first written to the main
     system memory and then sent to the GPUs for final processing.  By using
     GPUDirect/DirectGMA technology (b), the DMA engine has direct access to
-    GPU's internal memory.
+    the GPU's internal memory.
   }
   \label{fig:trad-vs-dgpu}
 \end{figure}