/*========================== begin_copyright_notice ============================
Copyright (C) 2019-2021 Intel Corporation
SPDX-License-Identifier: MIT
============================= end_copyright_notice ===========================*/
//===----------------------------------------------------------------------===//
///
/// Prior to this pass, all allocas will be allocated in the global address
/// space. For example:
///
/// %1 = alloca i32, addrspace(1)
///
/// The intrinsic lowering pass (which runs after this pass) will examine all
/// addrspace(1) allocas and reserve space for them in the RTStack. This is
/// useful for ray payload data that would be updated by another shader. For
/// example:
///
///   struct RayPayload
///   {
///       float4 color;
///   };
///
///   [shader("raygeneration")]
///   void MyRaygenShader()
///   {
///       RayPayload payload = { float4(0, 0, 0, 0) };
///       TraceRay(..., payload);
///   }
///
///   [shader("miss")]
///   void MyMissShader(inout RayPayload payload)
///   {
///       payload.color = float4(0, 0, 0, 1);
///   }
///
/// The payload parameter of MyMissShader is passed by reference, so it appears
/// as a pointer in the IR. We spill this pointer onto the RTStack in the
/// raygen shader so it can be read in the miss shader. The 16 bytes
/// representing the RayPayload will also be spilled on the stack so that
/// memory is still live when the miss shader writes to it. That allocation
/// must stay on the RTStack.
///
/// In contrast, an allocation used privately within a shader doesn't require
/// its lifetime to extend to the execution of other shaders so the above alloca
/// would become:
///
/// %1 = alloca i32
///
/// Intrinsic lowering will skip this alloca and leave it to
/// PrivateMemoryResolution.
//===----------------------------------------------------------------------===//
#include "RTBuilder.h"
#include "AllocaTracking.h"
#include "Compiler/IGCPassSupport.h"
#include "iStdLib/utility.h"
#include "common/LLVMUtils.h"
#include "common/LLVMWarningsPush.hpp"
#include <llvm/IR/InstVisitor.h>
#include <llvm/ADT/Statistic.h>
#include <llvm/Support/DebugCounter.h>
#include "common/LLVMWarningsPop.hpp"
#include "Probe/Assertion.h"
using namespace llvm;
using namespace IGC;
using namespace AllocaTracking;
#define DEBUG_TYPE "promote-to-scratch"
STATISTIC(NumAllocaPromoted, "Number of allocas promoted");
DEBUG_COUNTER(AllocaPromoteCounter, "promote-to-scratch-promote",
              "Controls number of promoted allocas");
class PromoteToScratchPass : public FunctionPass, public InstVisitor<PromoteToScratchPass>
{
public:
    PromoteToScratchPass() : FunctionPass(ID)
    {
        initializePromoteToScratchPassPass(*PassRegistry::getPassRegistry());
    }

    bool runOnFunction(Function &F) override;

    StringRef getPassName() const override
    {
        return "PromoteToScratchPass";
    }

    void getAnalysisUsage(llvm::AnalysisUsage &AU) const override
    {
        AU.setPreservesCFG();
    }

    void visitAllocaInst(AllocaInst& AI);

    static char ID;
private:
    bool Changed;
};
char PromoteToScratchPass::ID = 0;
// Register pass to igc-opt
#define PASS_FLAG "promote-to-scratch"
#define PASS_DESCRIPTION "Convert global allocas into private allocas"
#define PASS_CFG_ONLY false
#define PASS_ANALYSIS false
IGC_INITIALIZE_PASS_BEGIN(PromoteToScratchPass, PASS_FLAG, PASS_DESCRIPTION, PASS_CFG_ONLY, PASS_ANALYSIS)
IGC_INITIALIZE_PASS_END(PromoteToScratchPass, PASS_FLAG, PASS_DESCRIPTION, PASS_CFG_ONLY, PASS_ANALYSIS)
bool PromoteToScratchPass::runOnFunction(Function &F)
{
    Changed = false;
    visit(F);
    return Changed;
}
void PromoteToScratchPass::visitAllocaInst(AllocaInst& AI)
{
    if (!DebugCounter::shouldExecute(AllocaPromoteCounter))
        return;

    // For now, we use a simple filtering approach which can be expanded if
    // we see missing optimization opportunities. If all users are supported,
    // we can just mutate the type in place for each of the values.
    IGC_ASSERT_MESSAGE(RTBuilder::isNonLocalAlloca(&AI), "Should still be global!");

    SmallVector<Instruction*, 4> Insts;
    DenseSet<CallInst*> DeferredInsts;
    if (processAlloca(&AI, false, Insts, DeferredInsts))
    {
        Changed = true;
        NumAllocaPromoted++;
        rewriteTypes(ADDRESS_SPACE_PRIVATE, Insts, DeferredInsts);
    }
}
namespace IGC
{

Pass* createPromoteToScratchPass(void)
{
    return new PromoteToScratchPass();
}

} // namespace IGC