diff --git a/src/47-cuda-events/README.md b/src/47-cuda-events/README.md
index 4bd54dc..1927b93 100644
--- a/src/47-cuda-events/README.md
+++ b/src/47-cuda-events/README.md
@@ -469,7 +469,7 @@ cudaMemcpyD2H: 383.66 µs
 cudaFree: 0.00 µs
 ```
 
-The tracer adds about 2us overhead to each CUDA API call, which is negligible for most cases.
+The tracer adds about 2us overhead to each CUDA API call, which is negligible for most cases. To further reduce the overhead, you can try using the [bpftime](https://github.com/eunomia-bpf/bpftime) userspace runtime to optimize the eBPF program.
 
 ## Command Line Options
 
diff --git a/src/47-cuda-events/README.zh.md b/src/47-cuda-events/README.zh.md
index 2fc6d4f..0d85fd9 100644
--- a/src/47-cuda-events/README.zh.md
+++ b/src/47-cuda-events/README.zh.md
@@ -469,7 +469,7 @@ cudaMemcpyD2H: 383.66 µs
 cudaFree: 0.00 µs
 ```
 
-追踪器为每个CUDA API调用增加了约2微秒的开销,这对大多数情况来说是可以忽略不计的。
+追踪器为每个CUDA API调用增加了约2微秒的开销,这对大多数情况来说是可以忽略不计的。为了进一步减少开销,你可以尝试使用[bpftime](https://github.com/eunomia-bpf/bpftime)用户空间运行时来优化eBPF程序。
 
 ## 命令行选项
 