Peer-to-Peer Evolution

GPUDirect

  • Eliminate the need to make a redundant copy in CUDA host memory
  • Eliminate CPU bandwidth and latency bottlenecks

PeerDirect

  • Eliminate the need to make a redundant copy in host memory
  • Provide a direct path for data exchange between peer devices

PeerDirect Async

  • Control the RDMA device from the GPU
  • Reduce CPU utilization
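
The three stages above can be compared with a small toy model. This is only a sketch under stated assumptions: the stage names come from the outline, but the "baseline" stage, the buffer names, and the hop lists are hypothetical illustrations of where a payload might land on its way from GPU memory to the RDMA device. Nothing here is a measurement or a real driver API.

```python
# Toy model only: stage names follow the outline; the "baseline" entry,
# buffer names, and hop lists are illustrative assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPath:
    name: str
    hops: tuple           # buffers the payload passes through, in order
    cpu_posts_work: bool  # True if the CPU issues the RDMA work request

    def host_copies(self) -> int:
        """Count how many times the payload lands in host memory."""
        return sum(1 for hop in self.hops if hop.startswith("host"))


PATHS = [
    # Assumed starting point: separate CUDA and network buffers in host memory.
    DataPath("baseline",
             ("gpu_mem", "host_cuda_buf", "host_nic_buf", "rdma_device"), True),
    # GPUDirect: the redundant host copy is gone, one host buffer remains.
    DataPath("GPUDirect",
             ("gpu_mem", "host_shared_buf", "rdma_device"), True),
    # PeerDirect: direct path, no staging in host memory.
    DataPath("PeerDirect",
             ("gpu_mem", "rdma_device"), True),
    # PeerDirect Async: same data path, but the GPU drives the RDMA device.
    DataPath("PeerDirect Async",
             ("gpu_mem", "rdma_device"), False),
]


def summarize(paths):
    """Map each stage name to (host copies, whether the CPU posts the work)."""
    return {p.name: (p.host_copies(), p.cpu_posts_work) for p in paths}


if __name__ == "__main__":
    for name, (copies, cpu) in summarize(PATHS).items():
        print(f"{name:17s} host copies={copies}  CPU posts work={cpu}")
```

In this model the first two stages differ only in the number of host-memory copies, PeerDirect removes host staging entirely, and PeerDirect Async changes who controls the transfer rather than where the data travels.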