/*========================== begin_copyright_notice ============================
Copyright (C) 2019-2021 Intel Corporation
SPDX-License-Identifier: MIT
============================= end_copyright_notice ===========================*/
//===----------------------------------------------------------------------===//
///
/// LSC L1$ hit rates are important for raytracing throughput. We want to
/// minimize the amount of live (spilled) data between the various
/// continuations of a shader (useful even if we set caching controls to skip
/// L1 for the spill data).
///
/// This is meant as a very simple first pass prior to shader splitting where
/// we rematerialize values closer to their uses such that the live range
/// doesn't cross a TraceRay() call.
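///
/// As a rough illustration (pseudo-IR only; the value names and the abbreviated
/// call syntax are made up for this sketch and are not real IGC output):
///
///   Before:
///     %idx = DispatchRayIndex(0)
///     TraceRay(...)
///     use(%idx)          ; %idx is live across TraceRay() and must be spilled
///
///   After:
///     TraceRay(...)
///     %idx.remat = DispatchRayIndex(0)
///     use(%idx.remat)    ; recomputed in the continuation, nothing to spill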
///
//===----------------------------------------------------------------------===//
#include "RTBuilder.h"
#include "Compiler/IGCPassSupport.h"
#include "iStdLib/utility.h"
#include "common/LLVMUtils.h"
#include "common/LLVMWarningsPush.hpp"
#include <llvm/IR/InstIterator.h>
#include "common/LLVMWarningsPop.hpp"
using namespace llvm;
using namespace IGC;
class EarlyRematPass : public FunctionPass
{
public:
    EarlyRematPass() : FunctionPass(ID)
    {
        initializeEarlyRematPassPass(*PassRegistry::getPassRegistry());
    }
    bool runOnFunction(Function &F) override;
    StringRef getPassName() const override
    {
        return "EarlyRematPass";
    }
    void getAnalysisUsage(llvm::AnalysisUsage &AU) const override
    {
        AU.setPreservesCFG();
    }
    static char ID;
private:
    bool Changed;
};
char EarlyRematPass::ID = 0;
// Register pass to igc-opt
#define PASS_FLAG "early-remat"
#define PASS_DESCRIPTION "Do simple remats prior to shader splitting"
#define PASS_CFG_ONLY false
#define PASS_ANALYSIS false
IGC_INITIALIZE_PASS_BEGIN(EarlyRematPass, PASS_FLAG, PASS_DESCRIPTION, PASS_CFG_ONLY, PASS_ANALYSIS)
IGC_INITIALIZE_PASS_END(EarlyRematPass, PASS_FLAG, PASS_DESCRIPTION, PASS_CFG_ONLY, PASS_ANALYSIS)
bool EarlyRematPass::runOnFunction(Function &F)
{
    Changed = false;
    SmallVector<Instruction*, 8> RematInsts;
    // Collect candidates first so that the instruction list is not modified
    // while it is being iterated.
    for (auto& I : instructions(F))
    {
        auto* GII = dyn_cast<GenIntrinsicInst>(&I);
        if (!GII)
            continue;
        // These values are all read either directly from the global pointer
        // or through indirection to RTStack data via the global pointer.
        // It's clear that these will reduce spills as the entire intrinsic
        // can be recomputed in the continuation without needing to spill
        // any of the operands.
        //
        // TODO: probably want to grow this with additional cheap nomem
        // operations.
        switch (GII->getIntrinsicID())
        {
        case GenISAIntrinsic::GenISA_DispatchRayIndex:
        case GenISAIntrinsic::GenISA_RuntimeValue:
        case GenISAIntrinsic::GenISA_DispatchDimensions:
        case GenISAIntrinsic::GenISA_GlobalRootSignatureValue:
        // This is usually profitable but might want to do more analysis
        // here if we see many cases where it doesn't pan out.
        case GenISAIntrinsic::GenISA_LocalRootSignatureValue:
            RematInsts.push_back(GII);
            break;
        default:
            break;
        }
    }
    for (auto* I : RematInsts)
    {
        // Snapshot the uses: U->set() below edits I's use list, so it must
        // not be walked while it is being rewritten.
        SmallVector<Use*, 4> Uses;
        for (auto& U : I->uses())
            Uses.push_back(&U);
        for (auto *U : Uses)
        {
            auto *User = cast<Instruction>(U->getUser());
            Instruction* IP = nullptr;
            if (auto *PN = dyn_cast<PHINode>(User))
            {
                // A PHI operand must be available at the end of its incoming
                // block, so place the clone before that block's terminator.
                auto *InBB = PN->getIncomingBlock(*U);
                IP = InBB->getTerminator();
            }
            else
            {
                // Otherwise rematerialize immediately before the user.
                IP = User;
            }
            auto* NewI = I->clone();
            NewI->insertBefore(IP);
            U->set(NewI);
            Changed = true;
        }
        // Every use now refers to a clone, so the original is dead.
        if (I->use_empty())
            I->eraseFromParent();
    }
    return Changed;
}
namespace IGC
{
    Pass* createEarlyRematPass(void)
    {
        return new EarlyRematPass();
    }
} // namespace IGC