|
- Motivation/Problem
- Current-generation GPUs deliver several TFLOP/s of compute throughput, which
- makes I/O the bottleneck in applications with high bandwidth and low
- computational requirements. Applications that process data from external
- sources such as a front-end FPGA are affected twice by this problem: the data
- must first be transferred into main system memory by the CPU and is then
- moved to the GPU for final processing in a second transfer.
- Method/Solution
- To remedy this problem, we designed and implemented a system architecture
- comprising a custom FPGA board with a flexible DMA transfer policy and a
- heterogeneous compute framework that receives data through AMD's DirectGMA
- OpenCL extension.
- Results
- Conclusion
- The proposed system architecture sustains the bandwidth requirements of
- applications such as real-time tomographic image reconstruction and signal
- analysis, with a peak FPGA-GPU throughput of XXX GB/s.
|