#!/usr/bin/env python
# SPDX-FileCopyrightText: Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: LicenseRef-NvidiaProprietary
#
# NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
# property and proprietary rights in and to this material, related
# documentation and any modifications thereto. Any use, reproduction,
# disclosure or distribution of this material and related documentation
# without an express license agreement from NVIDIA CORPORATION or
# its affiliates is strictly prohibited.

import nsysstats


class SyncMemset(nsysstats.ExpertSystemsReport):

    display_name = 'DEPRECATED - Use cuda_memset_sync instead'
    usage = '{SCRIPT} -- {{DISPLAY_NAME}}'
    should_display = False

    message_advice = ("The following are synchronization APIs that block the"
        " host until all issued CUDA calls are complete.\n\n"
        "Suggestions:\n"
        " 1. Avoid excessive use of synchronization.\n"
        " 2. Use asynchronous CUDA event calls, such as cudaStreamWaitEvent()"
        " and cudaEventSynchronize(), to prevent host synchronization.")

    message_noresult = ("There were no problems detected related to"
        " synchronization APIs.")

    query_sync_memset = """
    WITH
        sid AS (
            SELECT
                *
            FROM
                StringIds
            WHERE
                    value LIKE 'cudaMemset%'
                AND value NOT LIKE '%async%'
        ),
        memset AS (
            SELECT
                *
            FROM
                CUPTI_ACTIVITY_KIND_MEMSET
            WHERE
                   memKind == 1
                OR memKind == 4
        )
    SELECT
        memset.end - memset.start AS "Duration:dur_ns",
        memset.start AS "Start:ts_ns",
        mk.label AS "Memory Kind",
        memset.bytes AS "Bytes:mem_B",
        (memset.globalPid >> 24) & 0x00FFFFFF AS "PID",
        memset.deviceId AS "Device ID",
        memset.contextId AS "Context ID",
        memset.streamId AS "Stream ID",
        sid.value AS "API Name",
        memset.globalPid AS "_Global ID",
        'cuda' AS "_API"
    FROM
        memset
    JOIN
        sid
        ON sid.id == runtime.nameId
    JOIN
        main.CUPTI_ACTIVITY_KIND_RUNTIME AS runtime
        ON runtime.correlationId == memset.correlationId
    LEFT JOIN
        ENUM_CUDA_MEM_KIND AS mk
        ON memKind == mk.id
    ORDER BY
        1 DESC
    LIMIT {ROW_LIMIT}
"""

    table_checks = {
        'CUPTI_ACTIVITY_KIND_RUNTIME':
            "{DBFILE} could not be analyzed because it does not contain the required CUDA data."
            " Does the application use CUDA runtime APIs?",
        'CUPTI_ACTIVITY_KIND_MEMSET':
            "{DBFILE} could not be analyzed because it does not contain the required CUDA data."
            " Does the application use CUDA memset APIs?",
        'ENUM_CUDA_MEM_KIND':
            "{DBFILE} does not contain ENUM_CUDA_MEM_KIND table."
    }
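
    def setup(self):
        # Let the base class validate the database first; propagate any error.
        err = super().setup()
        if err is not None:
            return err

        # Substitute the configured row limit into the SQL template.
        self.query = self.query_sync_memset.format(ROW_LIMIT=self._row_limit)


if __name__ == "__main__":
    SyncMemset.Main()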