HyPer's Rules for Parallelization

  • Rule #1: No random writes to non-local memory
    • Chunk the data, redistribute it, and then have each core sort or process only its local data.
  • Rule #2: Only perform sequential reads on non-local memory
    • This allows the hardware prefetcher to hide remote access latency.
  • Rule #3: No core should ever wait for another
    • Avoid fine-grained latching and synchronization barriers.
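
The three rules can be illustrated with a minimal sketch of a partition-then-sort pipeline. This is not HyPer's implementation: the names `partition_chunk` and `parallel_sort`, the equal-width range boundaries, and the use of Python threads are all assumptions made for illustration, and Python threads do not model NUMA placement. The sketch shows only the access pattern: each worker writes solely to its own private buffers (Rule #1), reads other workers' buffers in one sequential pass (Rule #2), and takes no locks inside a phase, so the only coordination is the coarse hand-off between the two phases (Rule #3).

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def partition_chunk(chunk, boundaries):
    """Scan one chunk sequentially and split it into private per-range buckets."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for value in chunk:
        # Index of the key range this value belongs to.
        buckets[bisect_right(boundaries, value)].append(value)
    return buckets


def parallel_sort(data, workers=4):
    """Hypothetical helper: range-partition, then sort each range locally."""
    if not data:
        return []

    # Chunk the input so every worker starts with its own slice.
    size = -(-len(data) // workers)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]

    # Assumption: equal-width key ranges. A real system would more likely
    # derive the boundaries from histograms to balance skewed data.
    lo, hi = min(data), max(data)
    boundaries = [lo + (hi - lo) * k / workers for k in range(1, workers)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: each worker writes only to buckets it created itself.
        partitioned = list(
            pool.map(lambda c: partition_chunk(c, boundaries), chunks)
        )

        # Phase 2: each worker gathers one key range by reading the other
        # workers' buckets front to back, then sorts its private copy.
        def sort_range(r):
            local = [v for buckets in partitioned for v in buckets[r]]
            local.sort()
            return local

        runs = list(pool.map(sort_range, range(workers)))

    # The ranges are disjoint and ordered, so concatenation is globally sorted.
    return [v for run in runs for v in run]


if __name__ == "__main__":
    print(parallel_sort([5, 3, 9, 1, 7, 2, 8]))
```

Because every key range is owned by exactly one worker, no worker ever needs to merge into or latch another worker's output; the cost is that a skewed key distribution leaves some workers with more data than others.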