/*========================== begin_copyright_notice ============================
Copyright (C) 2018-2025 Intel Corporation
SPDX-License-Identifier: MIT
============================= end_copyright_notice ===========================*/
//===----------------------------------------------------------------------===//
///
/// The main pipeline of raytracing passes. This is run by all APIs
/// (currently DX and Vulkan). The idea is to avoid putting anything too
/// API-specific here if possible.
///
//===----------------------------------------------------------------------===//
#include "IGC/common/StringMacros.hpp"
#include "AdaptorCommon/RayTracing/RayTracingInterface.h"
#include "AdaptorCommon/RayTracing/RayTracingPasses.hpp"
#include "AdaptorCommon/RayTracing/RTBuilder.h"
#include "AdaptorCommon/RayTracing/RayTracingAddressSpaceAliasAnalysis.h"
#include "AdaptorCommon/AddImplicitArgs.hpp"
#include "AdaptorCommon/ProcessFuncAttributes.h"
#include "AdaptorOCL/MoveStaticAllocas.h"
#include "Compiler/CISACodeGen/CodeSinking.hpp"
#include "Compiler/CISACodeGen/helper.h"
#include "Compiler/Optimizer/OpenCLPasses/PrivateMemory/PrivateMemoryUsageAnalysis.hpp"
#include "Compiler/Optimizer/OpenCLPasses/BreakConstantExpr/BreakConstantExpr.hpp"
#include "Compiler/Optimizer/OpenCLPasses/OpenCLPrintf/OpenCLPrintfAnalysis.hpp"
#include "Compiler/Optimizer/OpenCLPasses/OpenCLPrintf/OpenCLPrintfResolution.hpp"
#include "Compiler/Optimizer/OpenCLPasses/Atomics/ResolveOCLAtomics.hpp"
#include "Compiler/CustomSafeOptPass.hpp"
#include "IGC/common/LLVMUtils.h"
#include "common/LLVMWarningsPush.hpp"
#include <llvm/CodeGen/Passes.h>
#include <llvm/IR/Verifier.h>
#include <llvm/Transforms/IPO.h>
#include <llvm/Transforms/IPO/AlwaysInliner.h>
#include <llvm/Transforms/Scalar.h>
#include <llvm/Transforms/Utils.h>
#include <llvm/Analysis/AliasAnalysis.h>
#include "common/LLVMWarningsPop.hpp"
using namespace llvm;
using namespace IGC;
// This must be run prior to any raytracing passes so the BTI slots are
// allocated for them to use.
static void setupRegionBTIs(CodeGenContext *pContext) {
if (!pContext->m_DriverInfo.supportsRaytracingStatefulAccesses())
return;
SIMDMode Mode = SIMDMode::UNKNOWN;
if (auto SIMDSize = pContext->knownSIMDSize())
Mode = *SIMDSize;
// We rely on LSC messages to do bindless accesses through the surface
// state heap.
if (!pContext->platform.LSCEnabled(Mode))
return;
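// Returns an address space that encodes an indirectly-indexed resource,
// using a bindless surface-state buffer type unless bindless RT access is
// disabled (in which case a plain UAV is used).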
auto getAddrspace = [&]() {
BufferType BufType = IGC_IS_FLAG_ENABLED(DisableRTBindlessAccess) ? IGC::UAV : IGC::SSH_BINDLESS;
// There's nothing special about using UndefValue here. We just need
// to encode the address space as indirect.
return EncodeAS4GFXResource(*UndefValue::get(Type::getInt32Ty(*pContext->getLLVMContext())), BufType,
pContext->getUniqueIndirectIdx());
};
auto &rtInfo = pContext->getModuleMetaData()->rtInfo;
// Bump the indirect index once in case the current index is already in use.
pContext->getUniqueIndirectIdx();
uint32_t BaseOffset = 0;
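// Raytracing shaders may get stateful slots for the RT async stack, the
// SW hot zone, the SW stack, and the RT sync stack; all other shader types
// only need a slot for the RT sync stack.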
if (pContext->type == ShaderType::RAYTRACING_SHADER) {
auto disableStatefulSWStack = false;
if (IGC_IS_FLAG_DISABLED(DisableStatefulRTStackAccess)) {
rtInfo.RTAsyncStackAddrspace = getAddrspace();
rtInfo.RTAsyncStackSurfaceStateOffset = BaseOffset++;
}
if (IGC_IS_FLAG_DISABLED(DisableStatefulSWHotZoneAccess)) {
rtInfo.SWHotZoneAddrspace = getAddrspace();
rtInfo.SWHotZoneSurfaceStateOffset = BaseOffset++;
}
if (IGC_IS_FLAG_DISABLED(DisableStatefulSWStackAccess) && !disableStatefulSWStack) {
rtInfo.SWStackAddrspace = getAddrspace();
rtInfo.SWStackSurfaceStateOffset = BaseOffset++;
}
if (IGC_IS_FLAG_DISABLED(DisableStatefulRTSyncStackAccess4RTShader)) {
rtInfo.RTSyncStackAddrspace = getAddrspace();
rtInfo.RTSyncStackSurfaceStateOffset = BaseOffset++;
}
} else {
if (IGC_IS_FLAG_DISABLED(DisableStatefulRTSyncStackAccess4nonRTShader)) {
rtInfo.RTSyncStackAddrspace = getAddrspace();
rtInfo.RTSyncStackSurfaceStateOffset = BaseOffset++;
}
}
}
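// Select the raytracing memory layout for the module: default to the Xe
// style, and switch to the Xe3 style when the BVH uses 64-bit addressing.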
static void setupRTMemoryStyle(CodeGenContext *pContext) {
auto &rtInfo = pContext->getModuleMetaData()->rtInfo;
rtInfo.MemStyle = RTMemoryStyle::Xe;
if (pContext->bvhInfo.uses64Bit) {
rtInfo.MemStyle = RTMemoryStyle::Xe3;
}
}
namespace IGC {
void RayTracingInlineLowering(CodeGenContext *pContext) {
IGCPassManager mpm(pContext, "RayTracingInlineLowering");
setupRegionBTIs(pContext);
setupRTMemoryStyle(pContext);
mpm.add(new CodeGenContextWrapper(pContext));
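// Optionally override the ray TMax value when the OverrideTMax debug flag
// is set.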
if (IGC_IS_FLAG_ENABLED(OverrideTMax))
mpm.add(createOverrideTMaxPass(IGC_GET_FLAG_VALUE(OverrideTMax)));
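// Add dynamic ray query management when the platform supports ray query
// throttling and the driver still uses the existing TraceRayInline lowering
// in non-raytracing shaders.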
if (pContext->platform.enableRayQueryThrottling(pContext->getModuleMetaData()->compOpt.EnableDynamicRQManagement)) {
if (!pContext->m_DriverInfo.UseNewTraceRayInlineLoweringInNonRaytracingShaders())
mpm.add(CreateDynamicRayManagementPass());
}
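// Prepare TraceRayInline (ray query) calls when using the existing lowering
// path.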
if (!pContext->m_DriverInfo.UseNewTraceRayInlineLoweringInNonRaytracingShaders())
mpm.add(createTraceRayInlinePrepPass());
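// Optionally reschedule TraceRayInline code to hide ray query latency, then
// clean up the control flow afterwards.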
if (IGC_IS_FLAG_ENABLED(EnableRQHideLatency) &&
!pContext->m_DriverInfo.UseNewTraceRayInlineLoweringInNonRaytracingShaders()) {
mpm.add(createTraceRayInlineLatencySchedulerPass());
mpm.add(createCFGSimplificationPass());
}
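// Lower the ray query intrinsics, either via the existing TraceRayInline
// lowering or the new inline raytracing lowering, depending on the driver.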
if (!pContext->m_DriverInfo.UseNewTraceRayInlineLoweringInNonRaytracingShaders())
mpm.add(CreateTraceRayInlineLoweringPass());
else
mpm.add(createInlineRaytracing());
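// Lower uses of the RT globals pointer.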
mpm.add(CreateRTGlobalsPointerLoweringPass());
#ifdef _DEBUG
// Run verification after the lowering passes to ensure that we still have
// well-formed IR.
mpm.add(createVerifierPass(false));
#endif // _DEBUG
mpm.run(*pContext->getModule());
}
} // namespace IGC