====================================
JITLink and ORC's ObjectLinkingLayer
====================================
.. contents::
   :local:
Introduction
============
This document aims to provide a high-level overview of the design and API
of the JITLink library. It assumes some familiarity with linking and
relocatable object files, but should not require deep expertise. If you know
what a section, symbol, and relocation are you should find this document
accessible. If it is not, please submit a patch (:doc:`Contributing`) or file a
bug (:doc:`HowToSubmitABug`).
JITLink is a library for :ref:`jit_linking`. It was built to support the ORC JIT
APIs and is most commonly accessed via ORC's ObjectLinkingLayer API. JITLink was
developed with the aim of supporting the full set of features provided by each
object format, including static initializers, exception handling, thread local
variables, and language runtime registration. Supporting these features enables
ORC to execute code generated from source languages which rely on these features
(e.g. C++ requires object format support for static initializers to support
static constructors, eh-frame registration for exceptions, and TLV support for
thread locals; Swift and Objective-C require language runtime registration for
many features). For some object format features support is provided entirely
within JITLink, and for others it is provided in cooperation with the
(prototype) ORC runtime.
JITLink aims to support the following features, some of which are still under
development:
1. Cross-process and cross-architecture linking of single relocatable objects
   into a target *executor* process.
2. Support for all object format features.
3. Open linker data structures (``LinkGraph``) and pass system.
JITLink and ObjectLinkingLayer
==============================
``ObjectLinkingLayer`` is ORC's wrapper for JITLink. It is an ORC layer that
allows objects to be added to a ``JITDylib``, or emitted from some higher level
program representation. When an object is emitted, ``ObjectLinkingLayer`` uses
JITLink to construct a ``LinkGraph`` (see :ref:`constructing_linkgraphs`) and
calls JITLink's ``link`` function to link the graph into the executor process.
The ``ObjectLinkingLayer`` class provides a plugin API,
``ObjectLinkingLayer::Plugin``, which users can subclass in order to inspect and
modify ``LinkGraph`` instances at link time, and react to important JIT events
(such as an object being emitted into target memory). This enables many features
and optimizations that were not possible under MCJIT or RuntimeDyld.
ObjectLinkingLayer Plugins
--------------------------
The ``ObjectLinkingLayer::Plugin`` class provides the following methods:
* ``modifyPassConfig`` is called each time a LinkGraph is about to be linked. It
  can be overridden to install JITLink *Passes* to run during the link process.
  .. code-block:: c++
    void modifyPassConfig(MaterializationResponsibility &MR,
                          const Triple &TT,
                          jitlink::PassConfiguration &Config)
* ``notifyLoaded`` is called before the link begins, and can be overridden to
  set up any initial state for the given ``MaterializationResponsibility`` if
  needed.
  .. code-block:: c++
    void notifyLoaded(MaterializationResponsibility &MR)
* ``notifyEmitted`` is called after the link is complete and code has been
  emitted to the executor process. It can be overridden to finalize state
  for the ``MaterializationResponsibility`` if needed.
  .. code-block:: c++
    Error notifyEmitted(MaterializationResponsibility &MR)
* ``notifyFailed`` is called if the link fails at any point. It can be
  overridden to react to the failure (e.g. to deallocate any already allocated
  resources).
  .. code-block:: c++
    Error notifyFailed(MaterializationResponsibility &MR)
* ``notifyRemovingResources`` is called when a request is made to remove any
  resources associated with the ``ResourceKey`` *K* for the
  ``MaterializationResponsibility``.
  .. code-block:: c++
    Error notifyRemovingResources(ResourceKey K)
* ``notifyTransferringResources`` is called if/when a request is made to
  transfer tracking of any resources associated with ``ResourceKey``
  *SrcKey* to *DstKey*.
  .. code-block:: c++
    void notifyTransferringResources(ResourceKey DstKey,
                                     ResourceKey SrcKey)
Plugin authors are required to implement the ``notifyFailed``,
``notifyRemovingResources``, and ``notifyTransferringResources`` methods in
order to safely manage resources in the case of resource removal or transfer,
or link failure. If no resources are managed by the plugin then these methods
can be implemented as no-ops (returning ``Error::success()`` from the two
methods that return an ``Error``).
Plugin instances are added to an ``ObjectLinkingLayer`` by
calling the ``addPlugin`` method [1]_. E.g.
.. code-block:: c++
  // Plugin class to print the set of defined symbols in an object when that
  // object is linked.
  class MyPlugin : public ObjectLinkingLayer::Plugin {
  public:
    // Add passes to print the set of defined symbols after dead-stripping.
    void modifyPassConfig(MaterializationResponsibility &MR,
                          const Triple &TT,
                          jitlink::PassConfiguration &Config) override {
      Config.PostPrunePasses.push_back([this](jitlink::LinkGraph &G) {
        return printAllSymbols(G);
      });
    }
    // Implement mandatory overrides:
    Error notifyFailed(MaterializationResponsibility &MR) override {
      return Error::success();
    }
    Error notifyRemovingResources(ResourceKey K) override {
      return Error::success();
    }
    void notifyTransferringResources(ResourceKey DstKey,
                                     ResourceKey SrcKey) override {}
    // JITLink pass to print all defined symbols in G.
    Error printAllSymbols(jitlink::LinkGraph &G) {
      for (auto *Sym : G.defined_symbols())
        if (Sym->hasName())
          dbgs() << Sym->getName() << "\n";
      return Error::success();
    }
  };
  // Create our LLJIT instance using a custom object linking layer setup.
  // This gives us a chance to install our plugin.
  auto J = ExitOnErr(LLJITBuilder()
             .setObjectLinkingLayerCreator(
               [](ExecutionSession &ES, const Triple &T) {
                 // Manually set up the ObjectLinkingLayer for our LLJIT
                 // instance.
                 auto OLL = std::make_unique<ObjectLinkingLayer>(
                     ES, std::make_unique<jitlink::InProcessMemoryManager>());
                 // Install our plugin:
                 OLL->addPlugin(std::make_unique<MyPlugin>());
                 return OLL;
               })
             .create());
  // Add an object to the JIT. Nothing happens here: linking isn't triggered
  // until we look up some symbol in our object.
  ExitOnErr(J->addObject(loadFromDisk("main.o")));
  // Plugin triggers here when our lookup of main triggers linking of main.o
  auto MainSym = J->lookup("main");
LinkGraph
=========
JITLink maps all relocatable object formats to a generic ``LinkGraph`` type
that is designed to make linking fast and easy (``LinkGraph`` instances can
also be created manually. See :ref:`constructing_linkgraphs`).
Relocatable object formats (e.g. COFF, ELF, MachO) differ in their details,
but share a common goal: to represent machine level code and data with
annotations that allow them to be relocated in a virtual address space. To
this end they usually contain names (symbols) for content defined inside the
file or externally, chunks of content that must be moved as a unit (sections
or subsections, depending on the format), and annotations describing how to
patch content based on the final address of some target symbol/section
(relocations).
At a high level, the ``LinkGraph`` type represents these concepts as a decorated
graph. Nodes in the graph represent symbols and content, and edges represent
relocations. Each of the elements of the graph is listed here:
* ``Addressable`` -- A node in the link graph that can be assigned an address
  in the executor process's virtual address space.
  Absolute and external symbols are represented using plain ``Addressable``
  instances. Content defined inside the object file is represented using the
  ``Block`` subclass.
* ``Block`` -- An ``Addressable`` node that has ``Content`` (or is marked as
  zero-filled), a parent ``Section``, a ``Size``, an ``Alignment`` (and an
  ``AlignmentOffset``), and a list of ``Edge`` instances.
  Blocks provide a container for binary content which must remain contiguous in
  the target address space (a *layout unit*). Many interesting low level
  operations on ``LinkGraph`` instances involve inspecting or mutating block
  content or edges.
  * ``Content`` is represented as an ``llvm::StringRef``, and accessible via
    the ``getContent`` method. Content is only available for content blocks,
    and not for zero-fill blocks (use ``isZeroFill`` to check, and prefer
    ``getSize`` when only the block size is needed as it works for both
    zero-fill and content blocks).
  * ``Section`` is represented as a ``Section&`` reference, and accessible via
    the ``getSection`` method. The ``Section`` class is described in more detail
    below.
  * ``Size`` is represented as a ``size_t``, and is accessible via the
    ``getSize`` method for both content and zero-filled blocks.
  * ``Alignment`` is represented as a ``uint64_t``, and available via the
    ``getAlignment`` method. It represents the minimum alignment requirement (in
    bytes) of the start of the block.
  * ``AlignmentOffset`` is represented as a ``uint64_t``, and accessible via the
    ``getAlignmentOffset`` method. It represents the offset from the alignment
    required for the start of the block. This is required to support blocks
    whose minimum alignment requirement comes from data at some non-zero offset
    inside the block. E.g. if a block consists of a single byte (with byte
    alignment) followed by a uint64_t (with 8-byte alignment), then the block
    will have 8-byte alignment with an alignment offset of 7.
  * list of ``Edge`` instances. An iterator range for this list is returned by
    the ``edges`` method. The ``Edge`` class is described in more detail below.
* ``Symbol`` -- An offset from an ``Addressable`` (often a ``Block``), with an
  optional ``Name``, a ``Linkage``, a ``Scope``, a ``Callable`` flag, and a
  ``Live`` flag.
  Symbols make it possible to name content (blocks and addressables are
  anonymous), or target content with an ``Edge``.
  * ``Name`` is represented as an ``llvm::StringRef`` (equal to
    ``llvm::StringRef()`` if the symbol has no name), and accessible via the
    ``getName`` method.
  * ``Linkage`` is one of *Strong* or *Weak*, and is accessible via the
    ``getLinkage`` method. The ``JITLinkContext`` can use this flag to determine
    whether this symbol definition should be kept or dropped.
  * ``Scope`` is one of *Default*, *Hidden*, or *Local*, and is accessible via
    the ``getScope`` method. The ``JITLinkContext`` can use this to determine
    who should be able to see the symbol. A symbol with default scope should be
    globally visible. A symbol with hidden scope should be visible to other
    definitions within the same simulated dylib (e.g. ORC ``JITDylib``) or
    executable, but not from elsewhere. A symbol with local scope should only be
    visible within the current ``LinkGraph``.
  * ``Callable`` is a boolean which is set to true if this symbol can be called,
    and is accessible via the ``isCallable`` method. This can be used to
    automate the introduction of call-stubs for lazy compilation.
  * ``Live`` is a boolean that can be set to mark this symbol as a root for
    dead-stripping purposes (see :ref:`generic_link_algorithm`). JITLink's
    dead-stripping algorithm will propagate liveness flags through the graph to
    all reachable symbols before deleting any symbols (and blocks) that are not
    marked live.
* ``Edge`` -- A quad of an ``Offset`` (implicitly from the start of the
  containing ``Block``), a ``Kind`` (describing the relocation type), a
  ``Target``, and an ``Addend``.
  Edges represent relocations, and occasionally other relationships, between
  blocks and symbols.
  * ``Offset``, accessible via ``getOffset``, is an offset from the start of the
    ``Block`` containing the ``Edge``.
  * ``Kind``, accessible via ``getKind``, is a relocation type -- it describes
    what kinds of changes (if any) should be made to block content at the given
    ``Offset`` based on the address of the ``Target``.
  * ``Target``, accessible via ``getTarget``, is the ``Symbol`` whose address
    is relevant to the fixup calculation specified by the edge's ``Kind``.
  * ``Addend``, accessible via ``getAddend``, is a constant whose interpretation
    is determined by the edge's ``Kind``.
* ``Section`` -- A set of ``Symbol`` instances, plus a set of ``Block``
  instances, with a ``Name``, a set of ``ProtectionFlags``, and an ``Ordinal``.
  Sections make it easy to iterate over the symbols or blocks associated with
  a particular section in the source object file.
  * ``blocks()`` returns an iterator over the set of blocks defined in the
    section (as ``Block*`` pointers).
  * ``symbols()`` returns an iterator over the set of symbols defined in the
    section (as ``Symbol*`` pointers).
  * ``Name`` is represented as an ``llvm::StringRef``, and is accessible via the
    ``getName`` method.
  * ``ProtectionFlags`` are represented as a ``sys::Memory::ProtectionFlags``
    enum,
    and accessible via the ``getProtectionFlags`` method. These flags describe
    whether the section is readable, writable, executable, or some combination
    of these. The most common combinations are ``RW-`` for writable data,
    ``R--`` for constant data, and ``R-X`` for code.
  * ``Ordinal``, accessible via ``getOrdinal``, is a number used to order
    the section relative to others. It is usually used to preserve section
    order within a segment (a set of sections with the same memory protections)
    when laying out memory.
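
As a worked illustration of ``Alignment`` and ``AlignmentOffset``, the
following standalone sketch computes the first address at or above a candidate
address that satisfies ``Address % Alignment == AlignmentOffset``. It is not
JITLink code: the helper name ``alignToBlock`` and its plain-integer signature
are assumptions made for this example, and it assumes the alignment is a power
of two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the JITLink API): returns the lowest
// address >= Addr such that (Result % Alignment) == AlignmentOffset.
inline uint64_t alignToBlock(uint64_t Addr, uint64_t Alignment,
                             uint64_t AlignmentOffset) {
  // Assumption: Alignment is a power of two, so the unsigned wrap-around in
  // the subtraction below does not change the result modulo Alignment.
  assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);
  assert(AlignmentOffset < Alignment);
  uint64_t Delta = (AlignmentOffset - Addr) % Alignment;
  return Addr + Delta;
}
```

For the block described above (one byte followed by a ``uint64_t``, so 8-byte
alignment with an alignment offset of 7), a candidate address of ``0x1000``
is moved up to ``0x1007``, which places the ``uint64_t`` at ``0x1008``.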
For the graph-theorists: The ``LinkGraph`` is bipartite, with one set of
``Symbol`` nodes and one set of ``Addressable`` nodes. Each ``Symbol`` node has
one (implicit) edge to its target ``Addressable``. Each ``Block`` has a set of
edges (possibly empty, represented as ``Edge`` instances) back to elements of
the ``Symbol`` set. For convenience and performance of common algorithms,
symbols and blocks are further grouped into ``Sections``.
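
The graph shape described above is what makes liveness propagation simple: a
live symbol keeps its block alive, and that block's edges keep their target
symbols alive. The sketch below models this with simplified stand-in types
(``Symbol``, ``Block``, ``markAllReachableLive``, and ``countLive`` here are
hypothetical and are not the JITLink classes or its actual pruning code).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for the graph elements (illustration only).
struct Block;
struct Symbol {
  Block *Target = nullptr; // nullptr models an external or absolute symbol
  bool Live = false;       // set on roots before propagation
};
struct Block {
  std::vector<Symbol *> EdgeTargets; // one entry per outgoing edge
};

// Propagate liveness from the symbols already marked Live to every symbol
// reachable by following symbol -> block -> edge target.
inline void markAllReachableLive(const std::vector<Symbol *> &Symbols) {
  std::vector<Symbol *> Worklist;
  for (Symbol *S : Symbols) {
    assert(S && "null symbol in graph");
    if (S->Live)
      Worklist.push_back(S);
  }
  while (!Worklist.empty()) {
    Symbol *S = Worklist.back();
    Worklist.pop_back();
    if (!S->Target)
      continue; // no content to traverse
    for (Symbol *T : S->Target->EdgeTargets)
      if (!T->Live) {
        T->Live = true;
        Worklist.push_back(T);
      }
  }
}

// Number of symbols that would survive dead-stripping.
inline std::size_t countLive(const std::vector<Symbol *> &Symbols) {
  std::size_t N = 0;
  for (const Symbol *S : Symbols)
    N += S->Live ? 1 : 0;
  return N;
}
```

Because a symbol is only pushed onto the worklist when it is newly marked
live, cycles between blocks terminate without a separate visited set.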
The ``LinkGraph`` itself provides operations for constructing, removing, and
iterating over sections, symbols, and blocks. It also provides metadata
and utilities relevant to the linking process:
* Graph element operations
  * ``sections`` returns an iterator over all sections in the graph.
  * ``findSectionByName`` returns a pointer to the section with the given
    name (as a ``Section*``) if it exists, otherwise returns a nullptr.
  * ``blocks`` returns an iterator over all blocks in the graph (across all
    sections).
  * ``defined_symbols`` returns an iterator over all defined symbols in the
    graph (across all sections).
  * ``external_symbols`` returns an iterator over all external symbols in the
    graph.
  * ``absolute_symbols`` returns an iterator over all absolute symbols in the
    graph.
  * ``createSection`` creates a section with a given name and protection flags.
  * ``createContentBlock`` creates a block with the given initial content,
    parent section, address, alignment, and alignment offset.
  * ``createZeroFillBlock`` creates a zero-fill block with the given size,
    parent section, address, alignment, and alignment offset.
  * ``addExternalSymbol`` creates a new addressable and symbol with a given
    name, size, and linkage.
  * ``addAbsoluteSymbol`` creates a new addressable and symbol with a given
    name, address, size, linkage, scope, and liveness.
  * ``addCommonSymbol`` is a convenience function for creating a zero-filled block
    and weak symbol with a given name, scope, section, initial address, size,
    alignment and liveness.
  * ``addAnonymousSymbol`` creates a new anonymous symbol for a given block,
    offset, size, callable-ness, and liveness.
  * ``addDefinedSymbol`` creates a new symbol for a given block with a name,
    offset, size, linkage, scope, callable-ness and liveness.
  * ``makeExternal`` transforms a formerly defined symbol into an external one
    by creating a new addressable and pointing the symbol at it. The existing
    block is not deleted, but can be manually removed (if unreferenced) by
    calling ``removeBlock``. All edges to the symbol remain valid, but the
    symbol must now be defined outside this ``LinkGraph``.
  * ``removeExternalSymbol`` removes an external symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.
  * ``removeAbsoluteSymbol`` removes an absolute symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.
  * ``removeDefinedSymbol`` removes a defined symbol, but *does not* remove
    its target block.
  * ``removeBlock`` removes the given block.
  * ``splitBlock`` splits a given block in two at a given index (useful where
    it is known that a block contains decomposable records, e.g. CFI records
    in an eh-frame section).
* Graph utility operations
  * ``getName`` returns the name of this graph, which is usually based on the
    name of the input object file.
  * ``getTargetTriple`` returns an ``llvm::Triple`` for the executor process.
  * ``getPointerSize`` returns the size of a pointer (in bytes) in the executor
    process.
  * ``getEndianness`` returns the endianness of the executor process.
  * ``allocateString`` copies data from a given ``llvm::Twine`` into the
    link graph's internal allocator. This can be used to ensure that content
    created inside a pass outlives that pass's execution.
.. _generic_link_algorithm:
Generic Link Algorithm
======================
JITLink provides a generic link algorithm which can be extended / modified at
certain points by the introduction of JITLink :ref:`passes`:
#. Phase 1
   This phase is called immediately by the ``link`` function as soon as the
   initial configuration (including the pass pipeline setup) is complete.
   #. Run pre-prune passes.
      These passes are called on the graph before it is pruned. At this stage
      ``LinkGraph`` nodes still have their original vmaddrs. A mark-live pass
      (supplied by the ``JITLinkContext``) will be run at the end of this
      sequence to mark the initial set of live symbols.
      Notable use cases: marking nodes live, accessing/copying graph data that
      will be pruned (e.g. metadata that's important for the JIT, but not needed
      for the link process).
   #. Prune (dead-strip) the ``LinkGraph``.
      Removes all symbols and blocks not reachable from the initial set of live
      symbols.
      This allows JITLink to remove unreachable symbols / content, including
      overridden weak and redundant ODR definitions.
   #. Run post-prune passes.
      These passes are run on the graph after dead-stripping, but before memory
      is allocated or nodes assigned their final target vmaddrs.
      Passes run at this stage benefit from pruning, as dead functions and data
      have been stripped from the graph. However new content can still be added
      to the graph, as target and working memory have not been allocated yet.
      Notable use cases: Building Global Offset Table (GOT), Procedure Linkage
      Table (PLT), and Thread Local Variable (TLV) entries.
   #. Sort blocks into segments.
      Sorts all blocks by ordinal and then address. Collects sections with
      matching permissions into segments and computes the size of these
      segments for memory allocation.
   #. Allocate segment memory, update node addresses.
      Calls the ``JITLinkContext``'s ``JITLinkMemoryManager`` to allocate both
      working and target memory for the graph, then updates all node addresses
      to their assigned target address.
      Note: This step only updates the addresses of nodes defined in this graph.
      External symbols will still have null addresses.
   #. Run post-allocation passes.
      These passes are run on the graph after working and target memory have
      been allocated, but before the ``JITLinkContext`` is notified of the
      final addresses of the symbols in the graph. This gives these passes a
      chance to set up data structures associated with target addresses before
      any JITLink clients (especially ORC queries for symbol resolution) can
      attempt to access them.
      Notable use cases: Setting up mappings between target addresses and
      JIT data structures, such as a mapping between ``__dso_handle`` and
      ``JITDylib*``.
   #. Notify the ``JITLinkContext`` of the assigned symbol addresses.
      Calls ``JITLinkContext::notifyResolved`` on the link graph, allowing
      clients to react to the symbol address assignments made for this graph.
      In ORC this is used to notify any pending queries for *resolved* symbols,
      including pending queries from concurrently running JITLink instances that
      have reached the next step and are waiting on the address of a symbol in
      this graph to proceed with their link.
   #. Identify external symbols and resolve their addresses asynchronously.
      Calls the ``JITLinkContext`` to resolve the target address of any external
      symbols in the graph. This step is asynchronous -- JITLink will pack the
      link state into a *continuation* to be run once the symbols are resolved.
      This is the final step of Phase 1.
#. Phase 2
   This phase is called by the continuation constructed at the end of the
   external symbol resolution step above.
   #. Apply external symbol resolution results.
      This updates the addresses of all external symbols. At this point all
      nodes in the graph have their final target addresses, however node
      content still points back to the original data in the object file.
   #. Run pre-fixup passes.
      These passes are called on the graph after all nodes have been assigned
      their final target addresses, but before node content is copied into
      working memory and fixed up. Passes run at this stage can make late
      optimizations to the graph and content based on address layout.
      Notable use cases: GOT and PLT relaxation, where GOT and PLT accesses are
      bypassed for fixup targets that are directly accessible under the assigned
      memory layout.
   #. Copy block content to working memory and apply fixups.
      Copies all block content into allocated working memory (following the
      target layout) and applies fixups. Graph blocks are updated to point at
      the fixed up content.
   #. Run post-fixup passes.
      These passes are called on the graph after fixups have been applied and
      blocks updated to point to the fixed up content.
      Post-fixup passes can inspect block contents to see the exact bytes that
      will be copied to the assigned target addresses.
   #. Finalize memory asynchronously.
      Calls the ``JITLinkMemoryManager`` to copy working memory to the executor
      process and apply the requested permissions. This step is asynchronous --
      JITLink will pack the link state into a *continuation* to be run once
      memory has been copied and protected.
      This is the final step of Phase 2.
#. Phase 3
   This phase is called by the continuation constructed at the end of the
   memory finalization step above.
   #. Notify the context that the graph has been emitted.
      Calls ``JITLinkContext::notifyFinalized`` and hands off the
      ``JITLinkMemoryManager::Allocation`` object for this graph's memory
      allocation. This allows the context to track/hold memory allocations and
      react to the newly emitted definitions. In ORC this is used to update the
      ``ExecutionSession`` instance's dependence graph, which may result in
      these symbols (and possibly others) becoming *Ready* if all of their
      dependencies have also been emitted.
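
The "sort blocks into segments" step of Phase 1 can be pictured with the
following standalone sketch. It is a simplified model under stated
assumptions, not the JITLink implementation: ``BlockInfo``, ``Segment``, and
``buildSegments`` are hypothetical names, protections are modelled as a small
bitmask, and alignment offsets are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical, flattened view of a block for layout purposes.
struct BlockInfo {
  unsigned SectionOrdinal; // ordinal of the parent section
  uint64_t Address;        // original address in the object
  uint64_t Size;           // size in bytes
  uint64_t Alignment;      // power of two
  unsigned Prot;           // bitmask: 1 = read, 2 = write, 4 = execute
};

struct Segment {
  std::vector<BlockInfo> Blocks; // blocks in layout order
  uint64_t Size = 0;             // bytes needed, including alignment padding
};

// Sort blocks by (section ordinal, address), then group them by protection
// flags and accumulate the size of each group.
inline std::map<unsigned, Segment> buildSegments(std::vector<BlockInfo> Blocks) {
  std::sort(Blocks.begin(), Blocks.end(),
            [](const BlockInfo &L, const BlockInfo &R) {
              return std::tie(L.SectionOrdinal, L.Address) <
                     std::tie(R.SectionOrdinal, R.Address);
            });
  std::map<unsigned, Segment> Segments;
  for (const BlockInfo &B : Blocks) {
    assert(B.Alignment != 0 && (B.Alignment & (B.Alignment - 1)) == 0);
    Segment &Seg = Segments[B.Prot];
    // Round the running size up to this block's alignment, then append it.
    Seg.Size = (Seg.Size + B.Alignment - 1) & ~(B.Alignment - 1);
    Seg.Size += B.Size;
    Seg.Blocks.push_back(B);
  }
  return Segments;
}
```

The resulting per-segment sizes are the kind of information a memory manager
needs before it can allocate working and target memory in the next step.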
.. _passes:
Passes
------
JITLink passes are ``std::function<Error(LinkGraph&)>`` instances. They are free
to inspect and modify the given ``LinkGraph`` subject to the constraints of
whatever phase they are running in (see :ref:`generic_link_algorithm`). If a
pass returns ``Error::success()`` then linking continues. If a pass returns
a failure value then linking is stopped and the ``JITLinkContext`` is notified
that the link failed.
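
The stop-on-first-failure behaviour described above can be sketched without
any LLVM dependencies. In the example below ``Graph`` stands in for
``LinkGraph`` and ``PassError`` for ``llvm::Error``; both, along with
``runPasses``, are hypothetical simplifications for illustration only.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Stand-in for LinkGraph: records which passes touched it.
struct Graph {
  std::vector<std::string> Log;
};

// Stand-in for llvm::Error: an empty message means success.
struct PassError {
  std::string Message;
  bool isSuccess() const { return Message.empty(); }
  static PassError success() { return PassError{}; }
};

using Pass = std::function<PassError(Graph &)>;

// Run passes in order; return the first failure without running the rest.
inline PassError runPasses(Graph &G, const std::vector<Pass> &Passes) {
  for (const Pass &P : Passes) {
    assert(P && "empty pass in pipeline");
    PassError Err = P(G);
    if (!Err.isSuccess())
      return Err; // the caller would notify its context of the failure
  }
  return PassError::success();
}
```

A pipeline built this way is just a vector of callables, which is why both
JITLink backends and external plugins can contribute passes to the same list.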
Passes may be used by both JITLink backends (e.g. MachO/x86-64 implements GOT
and PLT construction as a pass), and external clients like
``ObjectLinkingLayer::Plugin``.
In combination with the open ``LinkGraph`` API, JITLink passes enable the
implementation of powerful new features. For example:
* Relaxation optimizations -- A pre-fixup pass can inspect GOT accesses and PLT
  calls and identify situations where the addresses of the entry target and the
  access are close enough to be accessed directly. In this case the pass can
  rewrite the instruction stream of the containing block and update the fixup
  edges to make the access direct.
  Code for this looks like:
.. code-block:: c++
  Error relaxGOTEdges(LinkGraph &G) {
    for (auto *B : G.blocks())
      for (auto &E : B->edges())
        if (E.getKind() == x86_64::GOTLoad) {
          auto &GOTTarget = getGOTEntryTarget(E.getTarget());
          if (isInRange(B->getFixupAddress(E), GOTTarget)) {
            // Rewrite B->getContent() at fixup address from
            // MOVQ to LEAQ
            // Update edge target and kind.
            E.setTarget(GOTTarget);
            E.setKind(x86_64::PCRel32);
          }
        }
    return Error::success();
  }
* Metadata registration -- Post-allocation passes can be used to record the
  address range of sections in the target. This can be used to register the
  metadata (e.g. exception handling frames, language metadata) in the target
  once memory has been finalized.
.. code-block:: c++
  Error registerEHFrameSection(LinkGraph &G) {
    if (auto *Sec = G.findSectionByName("__eh_frame")) {
      SectionRange SR(*Sec);
      registerEHFrameSection(SR.getStart(), SR.getEnd());
    }
    return Error::success();
  }
* Record call sites for later mutation -- A post-allocation pass can record
  the call sites of all calls to a particular function, allowing those call
  sites to be updated later at runtime (e.g. for instrumentation, or to
  enable the function to be lazily compiled but still called directly after
  compilation).
.. code-block:: c++
  StringRef FunctionName = "foo";
  std::vector<JITTargetAddress> CallSitesForFunction;
  auto RecordCallSites =
    [&](LinkGraph &G) -> Error {
      for (auto *B : G.blocks())
        for (auto &E : B->edges())
          if (E.getKind() == CallEdgeKind &&
              E.getTarget().hasName() &&
              E.getTarget().getName() == FunctionName)
            CallSitesForFunction.push_back(B->getFixupAddress(E));
      return Error::success();
    };
Memory Management with JITLinkMemoryManager
-------------------------------------------
JIT linking requires allocation of two kinds of memory: working memory in the
JIT process and target memory in the execution process (these processes and
memory allocations may be one and the same, depending on how the user wants
to build their JIT). It also requires that these allocations conform to the
requested code model in the target process (e.g. MachO/x86-64's Small code
model requires that all code and data for a simulated dylib is allocated within
4GiB). Finally, it is natural to make the memory manager responsible for
transferring memory to the target address space and applying memory protections,
since the memory manager must know how to communicate with the executor, and
since sharing and protection assignment can often be efficiently managed (in
the common case of running across processes on the same machine for security)
via the host operating system's virtual memory management APIs.
To satisfy these requirements ``JITLinkMemoryManager`` adopts the following
design: The memory manager itself has just one virtual method that returns a
``JITLinkMemoryManager::Allocation``:
.. code-block:: c++
  virtual Expected<std::unique_ptr<Allocation>>
  allocate(const JITLinkDylib *JD, const SegmentsRequestMap &Request) = 0;
This method takes a ``JITLinkDylib*`` representing the target simulated
dylib, and the full set of sections that must be allocated for this object.
``JITLinkMemoryManager`` implementations can (optionally) use the ``JD``
argument to manage a per-simulated-dylib memory pool (since code model
constraints are typically imposed on a per-dylib basis, and not across
dylibs) [2]_. The ``Request`` argument, by describing all sections in the current
object up-front, allows the implementer to allocate those sections as a
single slab, either within a pre-allocated per-jitdylib pool or directly
from system memory.
All subsequent operations are provided by the
``JITLinkMemoryManager::Allocation`` interface:
* ``virtual MutableArrayRef<char> getWorkingMemory(ProtectionFlags Seg)``
  Should be overridden to return the address in working memory of the segment
  with the given protection flags.
* ``virtual JITTargetAddress getTargetMemory(ProtectionFlags Seg)``
  Should be overridden to return the address in the executor's address space of
  the segment with the given protection flags.
* ``virtual void finalizeAsync(FinalizeContinuation OnFinalize)``
  Should be overridden to copy the contents of working memory to the target
  address space and apply memory protections for all segments. Where working
  memory and target memory are separate, this method should deallocate the
  working memory.
* ``virtual Error deallocate()``
  Should be overridden to deallocate memory in the target address space.
JITLink provides a simple in-process implementation of this interface:
``InProcessMemoryManager``. It allocates pages once and re-uses them as both
working and target memory.
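The in-process case can be modeled with a small self-contained class. This is a
sketch under stated assumptions, not the LLVM interface: ``ToyProt`` and
``ToyInProcessAllocation`` are hypothetical names, the request is simplified to
a map from protection flags to a size in bytes, and finalization only sets a
flag where a real implementation would apply memory protections.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>
  #include <map>
  #include <vector>

  // Hypothetical protection flags, for illustration only.
  enum ToyProt : unsigned { ToyRead = 1, ToyWrite = 2, ToyExec = 4 };

  // Toy in-process allocation: one buffer per segment serves as both working
  // memory and target memory, so both accessors report the same address.
  class ToyInProcessAllocation {
  public:
    explicit ToyInProcessAllocation(
        const std::map<unsigned, std::size_t> &Request) {
      for (const auto &KV : Request)
        Segments[KV.first] = std::vector<char>(KV.second, 0);
    }

    // Address of the segment as seen by the linker.
    char *getWorkingMemory(unsigned Prot) { return Segments.at(Prot).data(); }

    // Address of the segment as seen by the executor (same process here).
    std::uintptr_t getTargetMemory(unsigned Prot) {
      return reinterpret_cast<std::uintptr_t>(Segments.at(Prot).data());
    }

    // A real implementation would apply memory protections at this point.
    void finalize() { Finalized = true; }

    // Releases all segments.
    void deallocate() { Segments.clear(); }

    bool isFinalized() const { return Finalized; }
    std::size_t numSegments() const { return Segments.size(); }

  private:
    std::map<unsigned, std::vector<char>> Segments;
    bool Finalized = false;
  };

Because working and target memory coincide in this model, there is nothing to
copy during finalization; a cross-process manager would transfer the segment
contents at that step instead.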
ORC provides a cross-process ``JITLinkMemoryManager`` based on an ORC-RPC-based
implementation of the ``orc::TargetProcessControl`` API:
``OrcRPCTPCJITLinkMemoryManager``. This memory manager uses
TargetProcessControl API calls to allocate and manage memory in a remote
process. The underlying communication channel is determined by the ORC-RPC
channel type. Common options include Unix sockets or TCP.
JITLinkMemoryManager and Security
---------------------------------
JITLink's ability to link JIT'd code for a separate executor process can be
used to improve the security of a JIT system: The executor process can be
sandboxed, run within a VM, or even run on a fully separate machine.
JITLink's memory manager interface is flexible enough to allow for a range of
trade-offs between performance and security. For example, on a system where code
pages must be signed (preventing code from being updated), the memory manager
can deallocate working memory pages after linking to free memory in the process
running JITLink. Alternatively, on a system that allows RWX pages, the memory
manager may use the same pages for both working and target memory by marking
them as RWX, allowing code to be modified in place without further overhead.
Finally, if RWX pages are not permitted but dual-virtual-mappings of
physical memory pages are, then the memory manager can dual map physical pages
as RW- in the JITLink process and R-X in the executor process, allowing
modification from the JITLink process but not from the executor (at the cost of
extra administrative overhead for the dual mapping).
Error Handling
--------------
JITLink makes extensive use of the ``llvm::Error`` type (see the error handling
section of :doc:`ProgrammersManual` for details). The link process itself, all
passes, the memory manager interface, and operations on the ``JITLinkContext``
are all permitted to fail. Link graph construction utilities (especially parsers
for object formats) are encouraged to validate input, and validate fixups
(e.g. with range checks) before application.
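As an illustration of a range check, the sketch below tests whether a signed
32-bit displacement can represent the distance from a fixup to its target. The
helper name ``fitsInPCRel32`` is hypothetical, and the displacement formula
(target plus addend minus fixup address) is an assumption: the exact formula
depends on the relocation kind and object format.

.. code-block:: c++

  #include <cstdint>
  #include <limits>

  // Returns true if (TargetAddr + Addend - FixupAddr) fits in a signed 32-bit
  // field. Hypothetical helper; real fixup formulas vary by relocation kind.
  inline bool fitsInPCRel32(uint64_t FixupAddr, uint64_t TargetAddr,
                            int64_t Addend) {
    // Unsigned subtraction wraps, so the cast recovers a signed distance.
    int64_t Delta = static_cast<int64_t>(TargetAddr - FixupAddr) + Addend;
    return Delta >= std::numeric_limits<int32_t>::min() &&
           Delta <= std::numeric_limits<int32_t>::max();
  }

A backend following the advice above would return an error when such a check
fails rather than silently truncating the value.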
Any error will halt the link process and notify the context of failure. In ORC,
reported failures are propagated to queries pending on definitions provided by
the failing link, and also through edges of the dependence graph to any queries
waiting on dependent symbols.
.. _connection_to_orc_runtime:
Connection to the ORC Runtime
=============================
The ORC Runtime (currently under development) aims to provide runtime support
for advanced JIT features, including object format features that require
non-trivial action in the executor (e.g. running initializers, managing thread
local storage, registering with language runtimes, etc.).
ORC Runtime support for object format features typically requires cooperation
between the runtime (which executes in the executor process) and JITLink (which
runs in the JIT process and can inspect LinkGraphs to determine what actions
must be taken in the executor). For example: Execution of MachO static
initializers in the ORC runtime is performed by the ``jit_dlopen`` function,
which calls back to the JIT process to ask for the list of address ranges of
``__mod_init`` sections to walk. This list is collated by the
``MachOPlatformPlugin``, which installs a pass to record this information for
each object as it is linked into the target.
.. _constructing_linkgraphs:
Constructing LinkGraphs
=======================
Clients usually access and manipulate ``LinkGraph`` instances that were created
for them by an ``ObjectLinkingLayer`` instance, but they can be created manually:
#. By directly constructing and populating a ``LinkGraph`` instance.
#. By using the ``createLinkGraph`` family of functions to create a
   ``LinkGraph`` from an in-memory buffer containing an object file. This is how
   ``ObjectLinkingLayer`` usually creates ``LinkGraphs``.
   #. ``createLinkGraph_<Object-Format>_<Architecture>`` can be used when
      both the object format and architecture are known ahead of time.
   #. ``createLinkGraph_<Object-Format>`` can be used when the object format is
      known ahead of time, but the architecture is not. In this case the
      architecture will be determined by inspection of the object header.
   #. ``createLinkGraph`` can be used when neither the object format nor
      the architecture is known ahead of time. In this case the object header
      will be inspected to determine both the format and architecture.
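Header inspection of this kind usually starts with the magic number at the
beginning of the buffer. The sketch below is a simplified, self-contained
illustration and not the code used by ``createLinkGraph``: ``ToyFormat`` and
``detectFormat`` are hypothetical names, and only the ELF magic and the
little-endian 64-bit MachO magic are recognized.

.. code-block:: c++

  #include <cstddef>

  enum class ToyFormat { ELF, MachO64, Unknown };

  // Classifies an in-memory object buffer by its first four bytes.
  inline ToyFormat detectFormat(const unsigned char *Data, std::size_t Size) {
    if (Size < 4)
      return ToyFormat::Unknown;
    // ELF files begin with 0x7f 'E' 'L' 'F'.
    if (Data[0] == 0x7f && Data[1] == 'E' && Data[2] == 'L' && Data[3] == 'F')
      return ToyFormat::ELF;
    // 64-bit MachO magic 0xfeedfacf, as stored by a little-endian producer.
    if (Data[0] == 0xcf && Data[1] == 0xfa && Data[2] == 0xed &&
        Data[3] == 0xfe)
      return ToyFormat::MachO64;
    return ToyFormat::Unknown;
  }

A dispatcher built on this idea would then read the architecture field from the
format-specific header before choosing a graph builder.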
.. _jit_linking:
JIT Linking
===========
The JIT linker concept was introduced in LLVM's earlier generation of JIT APIs,
MCJIT. In MCJIT the *RuntimeDyld* component enabled re-use of LLVM as an
in-memory compiler by adding an in-memory link step to the end of the usual
compiler pipeline. Rather than dumping relocatable objects to disk as a compiler
usually would, MCJIT passed them to RuntimeDyld to be linked into a target
process.
This approach to linking differs from standard *static* or *dynamic* linking:
A *static linker* takes one or more relocatable object files as input and links
them into an executable or dynamic library on disk.
A *dynamic linker* applies relocations to executables and dynamic libraries that
have been loaded into memory.
A *JIT linker* takes a single relocatable object file at a time and links it
into a target process, usually using a context object to allow the linked code
to resolve symbols in the target.
RuntimeDyld
-----------
In order to keep RuntimeDyld's implementation simple, MCJIT imposed some
restrictions on compiled code:
#. It had to use the Large code model, and often restricted available relocation
   models in order to limit the kinds of relocations that had to be supported.
#. It required strong linkage and default visibility on all symbols -- behavior
   for other linkages/visibilities was not well defined.
#. It constrained and/or prohibited the use of features requiring runtime
   support, e.g. static initializers or thread local storage.
As a result of these restrictions not all language features supported by LLVM
worked under MCJIT, and objects to be loaded under the JIT had to be compiled to
target it (precluding the use of precompiled code from other sources under the
JIT).
RuntimeDyld also provided very limited visibility into the linking process
itself: Clients could access conservative estimates of section size
(RuntimeDyld bundled stub size and padding estimates into the section size
value) and the final relocated bytes, but could not access RuntimeDyld's
internal object representations.
Eliminating these restrictions and limitations was one of the primary motivations
for the development of JITLink.
The llvm-jitlink tool
=====================
The ``llvm-jitlink`` tool is a command line wrapper for the JITLink library.
It loads some set of relocatable object files and then links them using
JITLink. Depending on the options used it will then execute them, or validate
the linked memory.
The ``llvm-jitlink`` tool was originally designed to aid JITLink development by
providing a simple environment for testing.
Basic usage
-----------
By default, ``llvm-jitlink`` will link the set of objects passed on the command
line, then search for a "main" function and execute it:
.. code-block:: sh
  % cat hello-world.c
  #include <stdio.h>
  int main(int argc, char *argv[]) {
    printf("hello, world!\n");
    return 0;
  }
  % clang -c -o hello-world.o hello-world.c
  % llvm-jitlink hello-world.o
  hello, world!
Multiple objects may be specified, and arguments may be provided to the JIT'd
main function using the ``-args`` option:
.. code-block:: sh
  % cat print-args.c
  #include <stdio.h>
  void print_args(int argc, char *argv[]) {
    for (int i = 0; i != argc; ++i)
      printf("arg %i is \"%s\"\n", i, argv[i]);
  }
  % cat print-args-main.c
  void print_args(int argc, char *argv[]);
  int main(int argc, char *argv[]) {
    print_args(argc, argv);
    return 0;
  }
  % clang -c -o print-args.o print-args.c
  % clang -c -o print-args-main.o print-args-main.c
  % llvm-jitlink print-args.o print-args-main.o -args a b c
  arg 0 is "a"
  arg 1 is "b"
  arg 2 is "c"
Alternative entry points may be specified using the ``-entry <entry point
name>`` option.
Other options can be found by calling ``llvm-jitlink -help``.
llvm-jitlink as a regression testing utility
--------------------------------------------
One of the primary aims of ``llvm-jitlink`` was to enable readable regression
tests for JITLink. To do this it supports two options:
The ``-noexec`` option tells ``llvm-jitlink`` to stop after looking up the entry
point, and before attempting to execute it. Since the linked code is not
executed, this can be used to link for other targets even if you do not have
access to the target being linked (the ``-define-abs`` or ``-phony-externals``
options can be used to supply any missing definitions in this case).
The ``-check <check-file>`` option can be used to run a set of ``jitlink-check``
expressions against working memory. It is typically used in conjunction with
``-noexec``, since the aim is to validate JIT'd memory rather than to run the
code and ``-noexec`` allows us to link for any supported target architecture
from the current process. In ``-check`` mode, ``llvm-jitlink`` will scan the
given check-file for lines of the form ``# jitlink-check: <expr>``. See
examples of this usage in ``llvm/test/ExecutionEngine/JITLink``.
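The scanning step can be sketched as follows. This is a self-contained
illustration, not the tool's implementation: ``extractCheckExprs`` is a
hypothetical helper, and it assumes that the ``# jitlink-check:`` marker may
appear anywhere on a line and that surrounding whitespace is not significant.

.. code-block:: c++

  #include <sstream>
  #include <string>
  #include <vector>

  // Collects the text following "# jitlink-check:" on each line of FileText,
  // with leading and trailing whitespace removed.
  inline std::vector<std::string>
  extractCheckExprs(const std::string &FileText) {
    const std::string Prefix = "# jitlink-check:";
    const char *Blank = " \t\r";
    std::vector<std::string> Exprs;
    std::istringstream In(FileText);
    std::string Line;
    while (std::getline(In, Line)) {
      std::string::size_type Pos = Line.find(Prefix);
      if (Pos == std::string::npos)
        continue; // Not a check line.
      std::string Expr = Line.substr(Pos + Prefix.size());
      std::string::size_type First = Expr.find_first_not_of(Blank);
      if (First == std::string::npos)
        continue; // Marker with no expression after it.
      std::string::size_type Last = Expr.find_last_not_of(Blank);
      Exprs.push_back(Expr.substr(First, Last - First + 1));
    }
    return Exprs;
  }

Each extracted string would then be handed to the expression evaluator; the
expression syntax itself is outside the scope of this sketch.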
Remote execution via llvm-jitlink-executor
------------------------------------------
By default ``llvm-jitlink`` will link the given objects into its own process,
but this can be overridden by two options:
The ``-oop-executor[=/path/to/executor]`` option tells ``llvm-jitlink`` to
execute the given executor (which defaults to ``llvm-jitlink-executor``) and
communicate with it via file descriptors which it passes to the executor
as the first argument with the format ``filedescs=<in-fd>,<out-fd>``.
The ``-oop-executor-connect=<host>:<port>`` option tells ``llvm-jitlink`` to
connect to an already running executor via TCP on the given host and port. To
use this option you will need to start ``llvm-jitlink-executor`` manually with
``listen=<host>:<port>`` as the first argument.
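An executor has to decode its first argument before it can open the channel.
The sketch below shows one way to parse the ``filedescs=<in-fd>,<out-fd>``
form. It is an illustration only: ``parseFileDescs`` is a hypothetical helper
and does not reflect how ``llvm-jitlink-executor`` is written.

.. code-block:: c++

  #include <cstddef>
  #include <exception>
  #include <optional>
  #include <string>
  #include <utility>

  // Parses "filedescs=<in-fd>,<out-fd>" into a pair of descriptors.
  // Returns std::nullopt if the argument does not have that shape.
  inline std::optional<std::pair<int, int>>
  parseFileDescs(const std::string &Arg) {
    const std::string Prefix = "filedescs=";
    if (Arg.compare(0, Prefix.size(), Prefix) != 0)
      return std::nullopt;
    std::string Rest = Arg.substr(Prefix.size());
    std::string::size_type Comma = Rest.find(',');
    if (Comma == std::string::npos)
      return std::nullopt;
    std::string InStr = Rest.substr(0, Comma);
    std::string OutStr = Rest.substr(Comma + 1);
    try {
      std::size_t InEnd = 0, OutEnd = 0;
      int In = std::stoi(InStr, &InEnd);
      int Out = std::stoi(OutStr, &OutEnd);
      // Reject trailing characters such as "3x".
      if (InEnd != InStr.size() || OutEnd != OutStr.size())
        return std::nullopt;
      return std::make_pair(In, Out);
    } catch (const std::exception &) {
      return std::nullopt; // Not a number, or out of range.
    }
  }

The ``listen=<host>:<port>`` form could be handled the same way, splitting on
the last ``:`` instead of a comma.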
Harness mode
------------
The ``-harness`` option allows a set of input objects to be designated as a test
harness, with the regular object files implicitly treated as objects to be
tested. Definitions of symbols in the harness set override definitions in the
test set, and external references from the harness cause automatic scope
promotion of local symbols in the test set (these modifications to the usual
linker rules are accomplished via an ``ObjectLinkingLayer::Plugin`` installed by
``llvm-jitlink`` when it sees the ``-harness`` option).
With these modifications in place we can selectively test functions in an object
file by mocking those functions' callees. For example, suppose we have an object
file, ``test_code.o``, compiled from the following C source (which we need not
have access to):
.. code-block:: c
  void irrelevant_function() { irrelevant_external(); }
  int function_to_mock(int X) {
    return /* some function of X */;
  }
  static void function_to_test() {
    ...
    int Y = function_to_mock(/* some argument */);
    printf("Y is %i\n", Y);
  }
If we want to know how ``function_to_test`` behaves when we change the behavior
of ``function_to_mock`` we can test it by writing a test harness:
.. code-block:: c
  #include <stdio.h>
  void function_to_test();
  int function_to_mock(int X) {
    printf("used mock utility function\n");
    return 42;
  }
  int main(int argc, char *argv[]) {
    function_to_test();
    return 0;
  }
Under normal circumstances these objects could not be linked together:
``function_to_test`` is static and could not be resolved outside
``test_code.o``, the two ``function_to_mock`` functions would result in a
duplicate definition error, and ``irrelevant_external`` is undefined.
However, using ``-harness`` and ``-phony-externals`` we can run this code
with:
.. code-block:: sh
  % clang -c -o test_code_harness.o test_code_harness.c
  % llvm-jitlink -phony-externals test_code.o -harness test_code_harness.o
  used mock utility function
  Y is 42
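The two rule changes (harness definitions win, and test-set locals referenced
by the harness are promoted) can be modeled with a small lookup function. This
is a simplified sketch, not the plugin installed by ``llvm-jitlink``:
``ToyDef`` and ``resolveWithHarness`` are hypothetical names, and symbol tables
are reduced to maps from name to defining object.

.. code-block:: c++

  #include <map>
  #include <optional>
  #include <set>
  #include <string>

  // Hypothetical symbol record: the defining object and whether the symbol
  // has local (e.g. static) scope in that object.
  struct ToyDef {
    std::string Object;
    bool IsLocal;
  };

  // Returns the object that supplies Name under harness rules, or
  // std::nullopt if the name cannot be resolved.
  inline std::optional<std::string>
  resolveWithHarness(const std::string &Name,
                     const std::map<std::string, ToyDef> &HarnessDefs,
                     const std::map<std::string, ToyDef> &TestDefs,
                     const std::set<std::string> &HarnessExternals) {
    // Rule 1: a harness definition overrides any test-set definition.
    auto H = HarnessDefs.find(Name);
    if (H != HarnessDefs.end())
      return H->second.Object;
    auto T = TestDefs.find(Name);
    if (T == TestDefs.end())
      return std::nullopt;
    // Rule 2: a local test-set symbol is visible only if the harness
    // references it, which models automatic scope promotion.
    if (T->second.IsLocal && HarnessExternals.count(Name) == 0)
      return std::nullopt;
    return T->second.Object;
  }

Applied to the example above, ``function_to_mock`` resolves to the harness
object and ``function_to_test`` resolves to ``test_code.o`` even though it is
static.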
The ``-harness`` option may be of interest to people who want to perform some
very late testing on build products to verify that compiled code behaves as
expected. On basic C test cases this is relatively straightforward. Mocks for
more complicated languages (e.g. C++) are much trickier: Any code involving
classes tends to have a lot of non-trivial surface area (e.g. vtables) that
would require great care to mock.
Tips for JITLink backend developers
-----------------------------------
#. Make liberal use of ``assert`` and ``llvm::Error``. Do *not* assume that the
   input object is well formed: Return any errors produced by libObject (or your
   own object parsing code) and validate as you construct. Think carefully about
   the distinction between contract violations (which should be caught with
   asserts and ``llvm_unreachable``) and environmental errors (which should
   generate ``llvm::Error`` instances).
#. Don't assume you're linking in-process. Use libSupport's sized,
   endian-specific types when reading/writing content in the ``LinkGraph``.
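To illustrate the second tip, the sketch below reads and writes a 32-bit
little-endian value one byte at a time, so the result does not depend on the
byte order of the machine running the linker. ``readLE32`` and ``writeLE32``
are hypothetical stand-ins written in portable C++; in LLVM code the libSupport
endian utilities would be used instead.

.. code-block:: c++

  #include <cstdint>

  // Reads a 32-bit little-endian value from P, regardless of host byte order.
  inline uint32_t readLE32(const unsigned char *P) {
    return static_cast<uint32_t>(P[0]) | (static_cast<uint32_t>(P[1]) << 8) |
           (static_cast<uint32_t>(P[2]) << 16) |
           (static_cast<uint32_t>(P[3]) << 24);
  }

  // Writes V to P as four little-endian bytes, regardless of host byte order.
  inline void writeLE32(unsigned char *P, uint32_t V) {
    P[0] = static_cast<unsigned char>(V & 0xff);
    P[1] = static_cast<unsigned char>((V >> 8) & 0xff);
    P[2] = static_cast<unsigned char>((V >> 16) & 0xff);
    P[3] = static_cast<unsigned char>((V >> 24) & 0xff);
  }

Casting block content to ``uint32_t *`` and dereferencing it would instead bake
in the host's byte order and alignment rules, which breaks when the executor
runs on a different architecture.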
As a "minimum viable" JITLink wrapper, the ``llvm-jitlink`` tool is an
invaluable resource for developers bringing in a new JITLink backend. A standard
workflow is to start by throwing an unsupported object at the tool and seeing
what error is returned, then fixing that (you can often make a reasonable guess
at what should be done based on existing code for other formats or
architectures).
In debug builds of LLVM, the ``-debug-only=jitlink`` option dumps logs from the
JITLink library during the link process. These can be useful for spotting some bugs at
a glance. The ``-debug-only=llvm_jitlink`` option dumps logs from the ``llvm-jitlink``
tool, which can be useful for debugging both testcases (it is often less verbose than
``-debug-only=jitlink``) and the tool itself.
The ``-oop-executor`` and ``-oop-executor-connect`` options are helpful for testing
handling of cross-process and cross-architecture use cases.
Roadmap
=======
JITLink is under active development. Work so far has focused on the MachO
implementation. In LLVM 12 there is limited support for ELF on x86-64.
Major outstanding projects include:
* Refactor architecture support to maximize sharing across formats.
  All formats should be able to share the bulk of the architecture specific
  code (especially relocations) for each supported architecture.
* Refactor ELF link graph construction.
  ELF's link graph construction is currently implemented in the ``ELF_x86_64.cpp``
  file, and tied to the x86-64 relocation parsing code. The bulk of the code is
  generic and should be split into an ELFLinkGraphBuilder base class along the
  same lines as the existing generic MachOLinkGraphBuilder.
* Implement ELF support for arm64.
  Once the architecture support code has been refactored to enable sharing and
  ELF link graph construction has been refactored to allow re-use we should be
  able to construct an ELF / arm64 JITLink implementation by combining
  these existing pieces.
* Implement support for new architectures.
* Implement support for COFF.
  There is no COFF implementation of JITLink yet. Such an implementation should
  follow the MachO and ELF paths: a generic COFFLinkGraphBuilder base class that
  can be specialized for each architecture.
* Design and implement a shared-memory based JITLinkMemoryManager.
  One use-case that is expected to be common is out-of-process linking targeting
  another process on the same machine. This allows JITs to sandbox JIT'd code.
  For this use case a shared-memory based JITLinkMemoryManager would provide the
  most efficient form of allocation. Creating one will require designing a
  generic API for shared memory though, as LLVM does not currently have one.
JITLink Availability and Feature Status
---------------------------------------
.. list-table:: Availability and Status
   :widths: 10 30 30 30
   :header-rows: 1
   * - Architecture
     - ELF
     - COFF
     - MachO
   * - arm64
     -
     -
     - Partial (small code model, PIC relocation model only)
   * - x86-64
     - Partial
     -
     - Full (except TLV and debugging)
.. [1] See ``llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin`` for
       a full worked example.
.. [2] If not for *hidden* scoped symbols we could eliminate the
       ``JITLinkDylib*`` argument to ``JITLinkMemoryManager::allocate`` and
       treat every object as a separate simulated dylib for the purposes of
       memory layout. Hidden symbols break this by generating in-range accesses
       to external symbols, requiring the access and symbol to be allocated
       within range of one another. That said, providing a pre-reserved address
       range pool for each simulated dylib guarantees that the relaxation
       optimizations will kick in for all intra-dylib references, which is good
       for performance (at the cost of whatever overhead is introduced by
       reserving the address-range up-front).