| Matrix Dimension | C++       | OpenMP   | CUDA    |
|------------------|-----------|----------|---------|
| 16               | 0.112s    | 0.109s   | 3.785s  |
| 32               | 0.662s    | 0.582s   | 6.556s  |
| 256              | 1691.884s | 566.767s | 128.19s |