# 6. Performance


### [alignedTypes](./alignedTypes)
A simple test that shows the large access-speed gap between aligned and misaligned structures. It measures per-element copy throughput for both kinds of structure on large chunks of data.
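
To make the aligned/misaligned distinction concrete, here is a minimal host-side C++ sketch. It is not taken from the sample: the type names `RGBA32Plain` and `RGBA32Aligned` and the helper `isAligned` are hypothetical, and standard `alignas` is used here where CUDA code would typically use the `__align__(n)` specifier. Both structures hold the same 16 bytes; only the declared alignment differs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte element with only its natural alignment
// (that of std::uint32_t, typically 4 bytes).
struct RGBA32Plain {
    std::uint32_t r, g, b, a;
};

// Same members, but every object is required to start on a 16-byte boundary.
struct alignas(16) RGBA32Aligned {
    std::uint32_t r, g, b, a;
};

static_assert(sizeof(RGBA32Plain) == 16, "four 32-bit members, no padding");
static_assert(sizeof(RGBA32Aligned) == 16, "alignas(16) adds no padding here");
static_assert(alignof(RGBA32Aligned) == 16, "alignment was raised to 16");

// Returns true when p is a multiple of alignment (which must be a power of two).
inline bool isAligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

When an element's size matches its alignment, as with `RGBA32Aligned`, a compiler is free to move it with a single wide load or store; with weaker alignment it may have to split the access into several narrower ones, which is the kind of gap the sample measures.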

### [transpose](./transpose)
This sample demonstrates matrix transpose. Different optimizations are shown to achieve high performance.
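
As a rough illustration of why the access pattern matters for a transpose, the sketch below contrasts a naive loop with a tiled one on the CPU. It is an analogy under stated assumptions, not the sample's GPU kernels: the function names `transposeNaive` and `transposeTiled` and the tile width `kTile` are hypothetical, and matrices are stored row-major in a flat `std::vector<float>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile width; a real implementation would tune this value.
constexpr std::size_t kTile = 16;

// Naive transpose: reads of `in` are contiguous, writes to `out` are strided by `rows`.
std::vector<float> transposeNaive(const std::vector<float>& in,
                                  std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<float> out(rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            out[c * rows + r] = in[r * cols + c];
    return out;
}

// Tiled transpose: same result, but the work is done one kTile x kTile block
// at a time so that both the source and destination blocks stay small.
std::vector<float> transposeTiled(const std::vector<float>& in,
                                  std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<float> out(rows * cols);
    for (std::size_t r0 = 0; r0 < rows; r0 += kTile) {
        for (std::size_t c0 = 0; c0 < cols; c0 += kTile) {
            const std::size_t rEnd = std::min(r0 + kTile, rows);
            const std::size_t cEnd = std::min(c0 + kTile, cols);
            for (std::size_t r = r0; r < rEnd; ++r)
                for (std::size_t c = c0; c < cEnd; ++c)
                    out[c * rows + r] = in[r * cols + c];
        }
    }
    return out;
}
```

Both functions return a `cols` x `rows` matrix, so `transposeTiled(m, 3, 5)` and `transposeNaive(m, 3, 5)` should compare equal for any 3 x 5 input `m`. Only the order in which memory is touched differs.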

### [UnifiedMemoryPerf](./UnifiedMemoryPerf)
This sample compares the performance of a matrix multiplication kernel using Unified Memory, with and without hints, against other types of memory such as zero-copy buffers, pageable memory, and page-locked memory, performing synchronous and asynchronous transfers on a single GPU.