#!/usr/bin/env python
# SPDX-FileCopyrightText: Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: LicenseRef-NvidiaProprietary
#
# NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
# property and proprietary rights in and to this material, related
# documentation and any modifications thereto. Any use, reproduction,
# disclosure or distribution of this material and related documentation
# without an express license agreement from NVIDIA CORPORATION or
# its affiliates is strictly prohibited.
import nsysstats
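
# Illustrative sketch, not part of the original rule: the query in this file
# derives its "PID" column with (memcpy.globalPid >> 24) & 0x00FFFFFF.
# The helper name `pid_from_global_pid` is hypothetical; it only mirrors that
# SQL expression in Python, assuming (as the SQL implies) that the PID
# occupies the 24 bits starting at bit 24 of the global ID.
def pid_from_global_pid(global_pid):
    """Return the 24-bit PID field, mirroring the SQL expression in the query."""
    # Drop the low 24 bits, then mask off everything above the PID field.
    return (global_pid >> 24) & 0x00FFFFFF
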

class AsyncMemcpyPageable(nsysstats.ExpertSystemsReport):

    display_name = 'DEPRECATED - Use cuda_memcpy_async instead'
    usage = '{SCRIPT} -- {{DISPLAY_NAME}}'
    should_display = False

    message_advice = ("The following APIs use PAGEABLE memory which causes"
        " asynchronous CUDA memcpy operations to block and be executed"
        " synchronously. This leads to low GPU utilization.\n\n"
        "Suggestion: If applicable, use PINNED memory instead.")

    message_noresult = ("There were no problems detected related to memcpy"
        " operations using pageable memory.")

    # Select cudaMemcpy*Async* API calls whose source or destination memory
    # kind is 0 (the kind this rule reports as pageable), longest copies first.
    query_async_memcpy_pageable = """
    WITH
        sid AS (
            SELECT
                *
            FROM
                StringIds
            WHERE
                value LIKE 'cudaMemcpy%Async%'
        ),
        memcpy AS (
            SELECT
                *
            FROM
                CUPTI_ACTIVITY_KIND_MEMCPY
            WHERE
                   srcKind == 0
                OR dstKind == 0
        )
    SELECT
        memcpy.end - memcpy.start AS "Duration:dur_ns",
        memcpy.start AS "Start:ts_ns",
        msrck.label AS "Src Kind",
        mdstk.label AS "Dst Kind",
        memcpy.bytes AS "Bytes:mem_B",
        (memcpy.globalPid >> 24) & 0x00FFFFFF AS "PID",
        memcpy.deviceId AS "Device ID",
        memcpy.contextId AS "Context ID",
        memcpy.streamId AS "Stream ID",
        sid.value AS "API Name",
        memcpy.globalPid AS "_Global ID",
        memcpy.copyKind AS "_Copy Kind",
        'cuda' AS "_API"
    FROM
        memcpy
    JOIN
        sid
        ON sid.id == runtime.nameId
    JOIN
        main.CUPTI_ACTIVITY_KIND_RUNTIME AS runtime
        ON runtime.correlationId == memcpy.correlationId
    LEFT JOIN
        ENUM_CUDA_MEM_KIND AS msrck
        ON srcKind == msrck.id
    LEFT JOIN
        ENUM_CUDA_MEM_KIND AS mdstk
        ON dstKind == mdstk.id
    ORDER BY
        1 DESC
    LIMIT {ROW_LIMIT}
"""

    # Tables the query depends on, each with the message shown when it is missing.
    table_checks = {
        'CUPTI_ACTIVITY_KIND_RUNTIME':
            "{DBFILE} could not be analyzed because it does not contain the required CUDA data."
            " Does the application use CUDA runtime APIs?",
        'CUPTI_ACTIVITY_KIND_MEMCPY':
            "{DBFILE} could not be analyzed because it does not contain the required CUDA data."
            " Does the application use CUDA memcpy APIs?",
        'ENUM_CUDA_MEM_KIND':
            "{DBFILE} does not contain ENUM_CUDA_MEM_KIND table."
    }

    def setup(self):
        # Propagate any error reported by the base-class setup.
        err = super().setup()
        if err is not None:
            return err

        # Substitute the configured row limit into the query template.
        self.query = self.query_async_memcpy_pageable.format(ROW_LIMIT=self._row_limit)


if __name__ == "__main__":
    AsyncMemcpyPageable.Main()