LLD - The LLVM Linker

LLD is a linker from the LLVM project that is a drop-in replacement for system linkers and runs much faster than them. It also provides features that are useful for toolchain developers.

The linker supports ELF (Unix), PE/COFF (Windows), Mach-O (macOS) and WebAssembly, in descending order of completeness. Internally, LLD consists of several different linkers. The ELF port is the one described in this document. The PE/COFF port is complete, including Windows debug info (PDB) support. The WebAssembly port is still a work in progress (see WebAssembly lld port). The Mach-O port is based on a different architecture than the others. For details about Mach-O, please read ATOM-based lld.

Features

  • LLD is a drop-in replacement for the GNU linkers. It accepts the same command line arguments and linker scripts as the GNU linkers.

    We are currently working closely with the FreeBSD project to make LLD the default system linker in future versions of the operating system, so we are serious about addressing compatibility issues. As of February 2017, LLD is able to link the entire FreeBSD/amd64 base system including the kernel. With a few work-in-progress patches it can link approximately 95% of the ports collection on AMD64. For details, see the FreeBSD quarterly status report.

  • LLD is very fast. When you link a large program on a multicore machine, you can expect LLD to run more than twice as fast as the GNU gold linker. Your mileage may vary, though.

  • It supports various CPUs/ABIs including x86-64, x86, x32, AArch64, ARM, MIPS 32/64 big/little-endian, PowerPC, PowerPC 64 and AMDGPU. Among these, x86-64 is the most well-supported target and has reached production quality. AArch64 and MIPS seem decent too. x86 should be OK but is not well tested yet. ARM support is being developed actively.

  • It is always a cross-linker, meaning that it always supports all the above targets however it was built. In fact, we don’t provide a build-time option to enable/disable each target. This should make it easy to use our linker as part of a cross-compile toolchain.

  • You can embed LLD in your program to eliminate the dependency on external linkers. All you have to do is construct object files and command line arguments just as you would to invoke an external linker, and then call the linker’s main function, lld::elf::link, from your code.

  • It is small. We use the LLVM libObject library to read object files, so the comparison is not entirely fair, but as of February 2017, LLD/ELF consists of only 21k lines of C++ code while GNU gold consists of 198k lines of C++ code.

  • Link-time optimization (LTO) is supported by default. Essentially, all you have to do to use LTO is pass the -flto option to clang. Clang then creates object files not in the native object file format but in LLVM bitcode format. LLD reads bitcode object files, compiles them using LLVM and emits an output file. Because LLD can see the entire program this way, it can do whole-program optimization. (See the example after this list.)

  • Some very old features for ancient Unix systems (pre-90s or even before that) have been removed. Some default settings have been tuned for the 21st century. For example, the stack is marked as non-executable by default to tighten security.
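
For example, a whole-program LTO build with clang and LLD might look like this (file names are hypothetical):

$ clang -flto -c foo.c bar.c
$ clang -flto -fuse-ld=lld foo.o bar.o -o prog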

Performance

This is a link time comparison on a 2-socket 20-core 40-thread Xeon E5-2680 2.80 GHz machine with an SSD drive. We ran gold and lld with or without multi-threading support. To disable multi-threading, we added -no-threads to the command lines.

Program       Output size  GNU ld       GNU gold w/o threads  GNU gold w/threads  lld w/o threads  lld w/threads
ffmpeg dbg    92 MiB       1.72s        1.16s                 1.01s               0.60s            0.35s
mysqld dbg    154 MiB      8.50s        2.96s                 2.68s               1.06s            0.68s
clang dbg     1.67 GiB     104.03s      34.18s                23.49s              14.82s           5.28s
chromium dbg  1.14 GiB     209.05s [1]  64.70s                60.82s              27.60s           16.70s

As you can see, lld is significantly faster than the GNU linkers. Note that this is just a benchmark result from our environment. Depending on the number of available cores, the amount of available memory, and disk latency/throughput, your results may vary.

[1] Since GNU ld doesn’t support the -icf=all and -gdb-index options, we removed them from the command line for GNU ld. GNU ld would have been even slower if it had supported these options.

Build

If you have already checked out LLVM using SVN, you can check out LLD under the tools directory just like you probably did for clang. For details, see Getting Started with the LLVM System.

If you haven’t checked out LLVM, the easiest way to build LLD is to check out the entire LLVM project, including its sub-projects, from a git mirror and build that tree. You need cmake and of course a C++ compiler.

$ git clone https://github.com/llvm-project/llvm-project-20170507 llvm-project
$ mkdir build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=lld -DCMAKE_INSTALL_PREFIX=/usr/local ../llvm-project/llvm
$ make install

Using LLD

LLD is installed as ld.lld. On Unix, linkers are invoked by compiler drivers, so you are not expected to use that command directly. There are a few ways to tell compiler drivers to use ld.lld instead of the default linker.

The easiest way to do that is to overwrite the default linker. After installing LLD somewhere on your disk, you can create a symbolic link with ln -s /path/to/ld.lld /usr/bin/ld so that /usr/bin/ld resolves to LLD.

If you don’t want to change the system settings, you can use clang’s -fuse-ld option. In this case, you want to add -fuse-ld=lld to LDFLAGS when building your programs.
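
For example, to compile and link a single file directly with clang (file names are hypothetical):

$ clang -fuse-ld=lld -o hello hello.c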

LLD leaves its name and version number in a .comment section in the output. If you are in doubt whether you are successfully using LLD or not, run readelf --string-dump .comment <output-file> and examine the output. If the string “Linker: LLD” appears in the output, you are using LLD.
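
The check might look like this; the exact version string will vary with your LLD release, and compiler-generated .comment entries may appear alongside it:

$ readelf --string-dump .comment hello

String dump of section '.comment':
  [     0]  Linker: LLD 8.0.0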

History

Here is a brief project history of the ELF and COFF ports.

  • May 2015: We decided to rewrite the COFF linker and did that. We noticed that the new linker was much faster than the MSVC linker.
  • July 2015: The new ELF port was developed based on the COFF linker architecture.
  • September 2015: The first patches to support MIPS and AArch64 landed.
  • October 2015: Succeeded in self-hosting the ELF port. We noticed that the linker was faster than the GNU linkers, but we weren’t sure at the time whether we would be able to maintain the gap as we added more features to the linker.
  • July 2016: Started working on improving the linker script support.
  • December 2016: Succeeded in building the entire FreeBSD base system including the kernel. We had widened the performance gap against the GNU linkers.

Internals

For the internals of the linker, please read The ELF, COFF and Wasm Linkers. It is a bit outdated but the fundamental concepts remain valid. We’ll update the document soon.

The ELF, COFF and Wasm Linkers

The ELF Linker as a Library

You can embed LLD in your program by linking against it and calling the linker’s entry point function, lld::elf::link.

The current policy is that it is your responsibility to give trustworthy object files. The function is guaranteed to return as long as you do not pass corrupted or malicious object files. A corrupted file could cause a fatal error or SEGV. That being said, you don’t need to worry too much about it if you create object files in the usual way and give them to the linker. It is naturally expected to work, and otherwise it’s a linker bug.
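
Here is a minimal sketch of what embedding might look like, assuming the LLD 8-era entry point declared in lld/Common/Driver.h (the exact signature has varied across releases, so check the header for your version):

#include "lld/Common/Driver.h"
#include "llvm/Support/raw_ostream.h"

bool linkExecutable() {
  // Arguments are exactly what you would pass to ld.lld on the command line.
  const char *args[] = {"ld.lld", "foo.o", "bar.o", "-o", "a.out"};
  // CanExitEarly=false keeps LLD from calling exit() on error, so the
  // embedding process survives a failed link.
  return lld::elf::link(args, /*CanExitEarly=*/false, llvm::errs());
}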

Design

We will describe the design of the linkers in the rest of the document.

Key Concepts

Linkers are fairly large pieces of software. There are many design choices you have to make to create a complete linker.

This is a list of design choices we’ve made for ELF and COFF LLD. We believe that these high-level design choices achieve the right balance between speed, simplicity and extensibility.

  • Implement as native linkers

    We implemented the linkers as native linkers for each file format.

    The linkers share the same design but share very little code. Sharing code makes sense only if the benefit is worth its cost. In our case, the object formats are different enough that we thought a layer to abstract the differences wouldn’t be worth its complexity and run-time cost. Eliminating the abstraction layer has greatly simplified the implementation.

  • Speed by design

    One of the most important things in achieving high performance is to do less, rather than to do it efficiently. Therefore, the high-level design matters more than local optimizations. Since we are trying to create a high-performance linker, it is very important to keep the design as efficient as possible.

    Broadly speaking, we do not do anything until we have to do it. For example, we do not read section contents or relocations until we need them to continue linking. When we need to do some costly operation (such as looking up a hash table for each symbol), we do it only once. We obtain a handle (which is typically just a pointer to actual data) on the first operation and use it throughout the process.

  • Efficient archive file handling

    LLD’s handling of archive files (files with the “.a” file extension) is different from that of traditional Unix linkers and similar to that of Windows linkers. We’ll describe how the traditional Unix linker handles archive files, what the problem is, and how LLD approaches the problem.

    The traditional Unix linker maintains a set of undefined symbols during linking. The linker visits each file in the order in which they appeared on the command line until the set becomes empty. What the linker does depends on the file type.

    • If the linker visits an object file, it links it into the result, and any undefined symbols in the object file are added to the set.
    • If the linker visits an archive file, it checks the archive file’s symbol table and extracts all object files that have definitions for any symbols in the set.

    This algorithm sometimes leads to counter-intuitive behavior. If you give archive files before object files, nothing will happen, because when the linker visits the archives, there are no undefined symbols in the set. As a result, no files are extracted from the first archive file, and the link is done at that point because the set is empty after it visits one file.

    You can fix the problem by reordering the files, but that cannot fix the issue of mutually-dependent archive files.

    Linking mutually-dependent archive files is tricky. You may specify the same archive file multiple times to let the linker visit it more than once. Or, you may use the special command line options --start-group and --end-group to let the linker loop over the files between the options until no new symbols are added to the set (see the example after this list).

    Visiting the same archive files multiple times makes the linker slower.

    Here is how LLD approaches the problem. Instead of memorizing only undefined symbols, we programmed LLD so that it memorizes all symbols. When it sees an undefined symbol that can be resolved by extracting an object file from an archive file it previously visited, it immediately extracts the file and links it. This is doable because LLD does not forget symbols it has seen in archive files.

    We believe that LLD’s way is efficient and easy to justify.

    The semantics of LLD’s archive handling are different from the traditional Unix semantics. You can observe the difference if you carefully craft archive files to exploit it. However, in practice we don’t know of any program that cannot be linked with our algorithm, so it’s not going to cause trouble.
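
For reference, the traditional grouping workaround mentioned above looks like this (archive and file names are hypothetical); LLD accepts these options for compatibility, but its algorithm makes them unnecessary:

$ ld.lld main.o --start-group liba.a libb.a --end-group -o prog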

Numbers You Want to Know

To give you an intuition about what kinds of data the linker mainly works on, here is a list of the objects and their numbers that LLD has to read and process in order to link a very large executable. In order to link Chrome with debug info, which is roughly 2 GB in output size, LLD reads

  • 17,000 files,
  • 1,800,000 sections,
  • 6,300,000 symbols, and
  • 13,000,000 relocations.

LLD produces the 2 GB executable in 15 seconds.

These numbers vary depending on your program, but in general, you have a lot of relocations and symbols for each file. If your program is written in C++, symbol names are likely to be pretty long because of name mangling.

It is important to not waste time on relocations and symbols.

In the above case, the total amount of symbol strings is 450 MB, and inserting all of them into a hash table takes 1.5 seconds. Therefore, if you casually add a hash table lookup for each symbol, it would slow down the linker by 10%. So, don’t do that.

On the other hand, you don’t have to pursue efficiency when handling files.

Important Data Structures

We will describe the key data structures in LLD in this section. The linker can be understood as the interactions between them. Once you understand their functions, the code of the linker should look obvious to you.

  • Symbol

    This class represents a symbol. They are created for symbols in object files or archive files. The linker creates linker-defined symbols as well.

    There are basically three types of Symbols: Defined, Undefined, and Lazy.

    • Defined symbols are all symbols that are considered “resolved”, including real defined symbols, COMDAT symbols, common symbols, absolute symbols, linker-created symbols, etc.
    • Undefined symbols represent undefined symbols, which need to be replaced by Defined symbols by the resolver before the link is complete.
    • Lazy symbols represent symbols we found in archive file headers, which can turn into Defined symbols if we read the archive members.

    There’s only one Symbol instance for each unique symbol name. This uniqueness is guaranteed by the symbol table. As the resolver reads symbols from input files, it replaces an existing Symbol with the “best” Symbol for its symbol name using placement new (see the sketch after this list).

    The above mechanism allows you to use pointers to Symbols as a very cheap way to access name resolution results. Assume for example that you have a pointer to an undefined symbol before name resolution. If the symbol is resolved to a defined symbol by the resolver, the pointer will “automatically” point to the defined symbol, because the undefined symbol the pointer pointed to will have been replaced by the defined symbol in-place.

  • SymbolTable

    SymbolTable is basically a hash table from strings to Symbols with logic to resolve symbol conflicts. It resolves conflicts by symbol type.

    • If we add Defined and Undefined symbols, the symbol table will keep the former.
    • If we add Defined and Lazy symbols, it will keep the former.
    • If we add Lazy and Undefined, it will keep the former, but it will also trigger the Lazy symbol to load the archive member to actually resolve the symbol.
  • Chunk (COFF specific)

    Chunk represents a chunk of data that will occupy space in an output. Each regular section becomes a chunk. Chunks created for common or BSS symbols are not backed by sections. The linker may create chunks to append additional data to an output as well.

    Chunks know about their size, how to copy their data to mmap’ed outputs, and how to apply relocations to them. Specifically, section-based chunks know how to read relocation tables and how to apply them.

  • InputSection (ELF specific)

    Since we have less synthesized data for ELF, we don’t abstract slices of input files as Chunks for ELF. Instead, we directly use the input section as an internal data type.

    InputSections know their size and how to copy themselves to mmap’ed outputs, just like COFF Chunks.

  • OutputSection

    OutputSection is a container of InputSections (ELF) or Chunks (COFF). An InputSection or Chunk belongs to at most one OutputSection.
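
The placement-new replacement described under Symbol can be sketched like this; the types are hypothetical simplifications of the real classes in lld/ELF/Symbols.h:

#include <cstdint>
#include <new>

struct Symbol {
  enum Kind { Undefined, Defined } kind;
  const char *name;
  Symbol(Kind k, const char *n) : kind(k), name(n) {}
};

struct DefinedSymbol : Symbol {
  uint64_t value;
  DefinedSymbol(const char *n, uint64_t v) : Symbol(Defined, n), value(v) {}
};

// Each symbol table entry is a slot large enough for any Symbol subclass.
// Resolving an undefined symbol overwrites the slot in place, so every
// Symbol* handed out earlier now points at the definition automatically.
void resolveInPlace(Symbol *slot, const char *name, uint64_t value) {
  slot->~Symbol();                       // destroy the old object
  new (slot) DefinedSymbol(name, value); // placement new over the same slot
}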

There are mainly three actors in this linker.

  • InputFile

    InputFile is a superclass of file readers. We have a different subclass for each input file type, such as regular object file, archive file, etc. They are responsible for creating and owning Symbols and InputSections/Chunks.

  • Writer

    The writer is responsible for writing file headers and InputSections/Chunks to a file. It creates OutputSections, puts all InputSections/Chunks into them, assigns unique, non-overlapping addresses and file offsets to them, and then writes them out to a file.

  • Driver

    The linking process is driven by the driver. The driver:

    • processes command line options,
    • creates a symbol table,
    • creates an InputFile for each input file and puts all symbols within into the symbol table,
    • checks that there are no remaining undefined symbols,
    • creates a writer,
    • and passes the symbol table to the writer to write the result to a file.

Glossary

  • RVA (COFF)

    Short for Relative Virtual Address.

    Windows executables or DLLs are not position-independent; they are linked against a fixed address called an image base. RVAs are offsets from an image base.

    Default image bases are 0x140000000 for executables and 0x180000000 for DLLs. For example, when we are creating an executable, we assume that the executable will be loaded at address 0x140000000 by the loader, so we apply relocations accordingly. The resulting text and data will contain raw absolute addresses.

  • VA

    Short for Virtual Address. For COFF, it is equivalent to RVA + image base.

  • Base relocations (COFF)

    Relocation information for the loader. If the loader decides to map an executable or a DLL to a different address than its image base, it fixes up the binary using information contained in the base relocation table. A base relocation table consists of a list of locations containing addresses. The loader adds the difference between the actual load address and the image base to all locations listed there. For example, if an image with base 0x140000000 is loaded at 0x150000000, the loader adds 0x10000000 to every location listed in the table.

    Note that this run-time relocation mechanism is much simpler than ELF. There’s no PLT or GOT. Images are relocated as a whole just by shifting entire images in memory by some offsets. Although doing this breaks text sharing, I think this mechanism is not actually bad on today’s computers.

  • ICF

    Short for Identical COMDAT Folding (COFF) or Identical Code Folding (ELF).

    ICF is an optimization to reduce output size by merging read-only sections by their contents, not merely by their names. If two read-only sections happen to have the same metadata, contents and relocations, they are merged by ICF. It is known as an effective technique, and it usually reduces a C++ program’s size by a few percent or more.

    Note that this is not an entirely sound optimization. C/C++ require that distinct functions have distinct addresses. If a program depends on that property, it would fail at runtime.

    On Windows, that’s not really an issue because MSVC link.exe enables the optimization by default. As long as your program works with the linker’s default settings, your program should be safe with ICF.

    On Unix, your program is generally not guaranteed to be safe with ICF, although in practice large programs happen to work correctly. LLD itself works fine with ICF, for example.
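
For reference, with the ELF port you can enable ICF from the compiler driver like this (file names are hypothetical):

$ clang -fuse-ld=lld -Wl,--icf=all main.o util.o -o prog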

ATOM-based lld

Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.

ATOM-based lld is a new set of modular code for creating linker tools. Currently it supports Mach-O.

  • End-User Features:
    • Compatible with existing linker options
    • Reads standard Object Files
    • Writes standard Executable Files
    • Removes clang’s reliance on “the system linker”
    • Uses the LLVM “UIUC” BSD-Style license.
  • Applications:
    • Modular design
    • Support cross linking
    • Easy to add new CPU support
    • Can be built as static tool or library
  • Design and Implementation:
    • Extensive unit tests
    • Internal linker model can be dumped/read to textual format
    • Additional linking features can be plugged in as “passes”
    • OS specific and CPU specific code factored out

Why a new linker?

The fact that clang relies on whatever linker tool you happen to have installed means that clang has been very conservative in adopting features which require a recent linker.

In the same way that the MC layer of LLVM has removed clang’s reliance on the system assembler tool, the lld project will remove clang’s reliance on the system linker tool.

Linker Design

Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.

Introduction

lld is a new generation of linker. It is not “section” based like traditional linkers, which mostly just interlace sections from multiple object files into the output file. Instead, lld is based on “Atoms”. Traditional section-based linking works well for simple linking, but its model makes advanced linking features difficult to implement. Features like dead code stripping, reordering functions for locality, and C++ coalescing require the linker to work at a finer grain.

An atom is an indivisible chunk of code or data. An atom has a set of attributes, such as: name, scope, content-type, alignment, etc. An atom also has a list of References. A Reference contains: a kind, an optional offset, an optional addend, and an optional target atom.

The Atom model allows the linker to use standard graph theory models for linking data structures. Each atom is a node, and each Reference is an edge. The feature of dead code stripping is implemented by following edges to mark all live atoms, and then deleting the non-live atoms.

Atom Model

An atom is an indivisible chunk of code or data. Typically each user written function or global variable is an atom. In addition, the compiler may emit other atoms, such as for literal c-strings or floating point constants, or for runtime data structures like dwarf unwind info or pointers to initializers.

A simple “hello world” object file would be modeled like this:

[Figure: the atom graph for a simple “hello world” object file]

There are three atoms: main, a proxy for printf, and an anonymous atom containing the c-string literal “hello world”. The Atom “main” has two references. One is the call site for the call to printf, and the other is a reference for the instruction that loads the address of the c-string literal.

There are only four different types of atoms:

  • DefinedAtom

    This is a chunk of code or data; 95% of all atoms are DefinedAtoms.

  • UndefinedAtom

    This is a placeholder in object files for a reference to some atom outside the translation unit. During core linking it is usually replaced by (coalesced into) another Atom.

  • SharedLibraryAtom

    If a required symbol name turns out to be defined in a dynamic shared library (and not in some object file), a SharedLibraryAtom is the placeholder Atom used to represent that fact.

    It is similar to an UndefinedAtom, but it also tracks information about the associated shared library.

  • AbsoluteAtom

    This is for embedded support where some functionality is implemented in ROM at some fixed address. This atom has no content. It is just an address; the Writer fixes up any references so that they point to that address.

File Model

The linker views the input files as basically containers of Atoms and References, plus just a few attributes of their own. The linker works with three kinds of files: object files, static libraries, and dynamic shared libraries. Each kind of file has a reader object which presents the file in the model expected by the linker.

Object File

An object file is just a container of atoms. When linking an object file, a reader is instantiated which parses the object file and instantiates a set of atoms representing all content in the .o file. The linker adds all those atoms to a master graph.

Static Library (Archive)

This is the traditional Unix static archive, which is just a collection of object files with a “table of contents”. When linking with a static library, by default nothing is added to the master graph of atoms. Instead, if after merging all atoms from object files into the master graph any “undefined” atoms are left remaining, the linker reads the table of contents of each static library to see if any has the needed definitions. If so, the set of atoms from the specified object file in the static library is added to the master graph of atoms.

Dynamic Library (Shared Object)

Dynamic libraries are different from object files and static libraries in that they don’t directly add any content. Their purpose is to check at build time that the remaining undefined references can be resolved at runtime, and to provide a list of dynamic libraries (SO_NEEDED) that will be needed at runtime. The way this is modeled in the linker is that a dynamic library contributes no atoms to the initial graph of atoms. Instead, (like static libraries) if there are “undefined” atoms in the master graph of all atoms, each dynamic library is checked to see if it exports the required symbol. If so, a “shared library” atom is instantiated by the reader, which the linker uses to replace the “undefined” atom.

Linking Steps

Through the use of abstract Atoms, the core of linking is architecture independent and file format independent. All command line parsing is factored out into a separate “options” abstraction which enables the linker to be driven with different command line sets.

The overall steps in linking are:

  1. Command line processing
  2. Parsing input files
  3. Resolving
  4. Passes/Optimizations
  5. Generate output file

The Resolving and Passes steps are done purely on the master graph of atoms, so they have no notion of file formats such as mach-o or ELF.

Input Files

Existing developer tools use different file formats for object files. A goal of lld is to be file format independent. This is done through a plug-in model for reading object files. The lld::Reader is the base class for all object file readers. A Reader follows the factory method pattern. A Reader instantiates an lld::File object (which is a graph of Atoms) from a given object file (on disk or in-memory).

Every Reader subclass defines its own “options” class (for instance the mach-o Reader defines the class ReaderOptionsMachO). This options class is the one-and-only way to control how the Reader operates when parsing an input file into an Atom graph. For instance, you may want the Reader to only accept certain architectures. The options class can be instantiated from command line options, or it can be subclassed and the ivars programmatically set.

Resolving

The resolving step takes the atom graphs from each object file and combines them into one master object graph. Unfortunately, it is not as simple as appending the atom list from each file into one big list. There are many cases where atoms need to be coalesced. That is, two or more atoms need to be coalesced into one atom. This is necessary to support: C language “tentative definitions”, C++ weak symbols for templates and inlines defined in headers, replacing undefined atoms with actual definition atoms, and merging copies of constants like c-strings and floating point constants.

The linker supports coalescing by-name and by-content. By-name is used for tentative definitions and weak symbols. By-content is used for constant data that can be merged.

The resolving process maintains some global linking “state”, including a “symbol table” which is a map from llvm::StringRef to lld::Atom*. With these data structures, the linker iterates all atoms in all input files. For each atom, it checks if the atom is named and has a global or hidden scope. If so, the atom is added to the symbol table map. If there already is a matching atom in that table, that means the current atom needs to be coalesced with the found atom, or it is a multiple definition error.
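
A minimal sketch of that symbol table logic, with a hypothetical simplified Atom type and only the undefined-replaced-by-definition rule shown:

#include <utility>

#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/StringRef.h"

struct Atom {
  llvm::StringRef name;
  bool isDefined;
};

llvm::DenseMap<llvm::StringRef, Atom *> symbolTable;

// Returns the winning atom for this name, coalescing as needed.
Atom *addAtom(Atom *atom) {
  auto ins = symbolTable.insert(std::make_pair(atom->name, atom));
  if (ins.second)
    return atom; // first atom seen with this name
  Atom *&existing = ins.first->second;
  // An undefined atom is always coalesced into a real definition.
  if (!existing->isDefined && atom->isDefined)
    existing = atom;
  // Two definitions would be a multiple definition error (not shown).
  return existing;
}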

When all initial input file atoms have been processed by the resolver, a scan is made to see if there are any undefined atoms left in the graph. If there are, the linker scans all libraries (both static and dynamic) looking for definitions to replace the undefined atoms. It is an error if any undefined atoms remain.

Dead code stripping (if requested) is done at the end of resolving. The linker does a simple mark-and-sweep. It starts with “root” atoms (like “main” in a main executable), follows each reference, and marks each Atom that it visits as “live”. When done, all atoms not marked “live” are removed.

The result of the Resolving phase is the creation of an lld::File object. The goal is that the lld::File model is the internal representation throughout the linker. The file readers parse (mach-o, ELF, COFF) into an lld::File. The file writers (mach-o, ELF, COFF) take an lld::File and produce their file kind, and every Pass operates only on an lld::File. This is not only a simpler, consistent model, but it also enables the state of the linker to be dumped at any point in the link for testing purposes.

Passes

The Passes step is an open-ended set of routines that each get a chance to modify or enhance the current lld::File object. Some example Passes are:

  • stub (PLT) generation
  • GOT instantiation
  • order_file optimization
  • branch island generation
  • branch shim generation
  • Objective-C optimizations (Darwin specific)
  • TLV instantiation (Darwin specific)
  • DTrace probe processing (Darwin specific)
  • compact unwind encoding (Darwin specific)

Some of these passes are specific to Darwin’s runtime environments. But many of the passes are applicable to any OS (such as generating branch islands for out-of-range branch instructions).

The general structure of a pass is to iterate through the atoms in the current lld::File object, inspecting each atom and doing something. For instance, the stub pass looks for call sites to shared library atoms (e.g. a call to printf). It then instantiates a “stub” atom (PLT entry) and a “lazy pointer” atom for each proxy atom needed, and these new atoms are added to the current lld::File object. Next, all the noted call sites to shared library atoms have their References altered to point to the stub atom instead of the shared library atom.

Generate Output File

Once the passes are done, the output file writer is given the current lld::File object. The writer’s job is to create the executable content file wrapper and place the content of the atoms into it.

lld uses a plug-in model for writing output files. All concrete writers (e.g. ELF, mach-o, etc) are subclasses of the lld::Writer class.

Unlike the Reader class which has just one method to instantiate an lld::File, the Writer class has multiple methods. The crucial method is to generate the output file, but there are also methods which allow the Writer to contribute Atoms to the resolver and specify passes to run.

An example of contributing atoms is that if the Writer knows a main executable is being linked and such an executable requires a specially named entry point (e.g. “_main”), the Writer can add an UndefinedAtom with that special name to the resolver. This will cause the resolver to issue an error if that symbol is not defined.

Sometimes a Writer supports lazily created symbols, such as names for the start of sections. To support this, the Writer can create a File object which vends no initial atoms, but does lazily supply atoms by name as needed.

Every Writer subclass defines its own “options” class (for instance the mach-o Writer defines the class WriterOptionsMachO). This options class is the one-and-only way to control how the Writer operates when producing an output file from an Atom graph. For instance, you may want the Writer to optimize the output for certain OS versions, or strip local symbols, etc. The options class can be instantiated from command line options, or it can be subclassed and the ivars programmatically set.

lld::File representations

Just as LLVM has three representations of its IR model, lld has two representations of its File/Atom/Reference model:

  • In memory, abstract C++ classes (lld::Atom, lld::Reference, and lld::File).
  • textual (in YAML)

Textual representations in YAML

In designing a textual format we want something easy for humans to read and easy for the linker to parse. Since an atom has lots of attributes, most of which are usually just the default, we should define default values for every attribute so that they can be omitted from the text representation. Here are the atoms for a simple hello world program expressed in YAML:

target-triple:   x86_64-apple-darwin11

atoms:
    - name:    _main
      scope:   global
      type:    code
      content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00,
                 00, 00, 31, c0, 5d, c3 ]
      fixups:
      - offset: 07
        kind:   pcrel32
        target: 2
      - offset: 0E
        kind:   call32
        target: _fprintf

    - type:    c-string
      content: [ 73, 5A, 00 ]

...

The biggest use for the textual format will be writing test cases. Writing test cases in C is problematic because the compiler may vary its output over time for its own optimization reasons, which may inadvertently disable or break the linker feature being tested. By writing test cases in the linker’s own textual format, we can exactly specify every attribute of every atom and thus target specific linker logic.

The textual/YAML format follows the ReaderWriter patterns used in lld. The lld library comes with the classes: ReaderYAML and WriterYAML.

Testing

The lld project contains a test suite which is being built up as new code is added to lld. All new lld functionality should have tests added to the test suite. The test suite is lit driven. Each test is a text file with comments telling lit how to run the test and check the result. To facilitate testing, the lld project builds a tool called lld-core. This tool reads a YAML file (by default from stdin), parses it into one or more lld::File objects in memory and then feeds those lld::File objects to the resolver phase.

Resolver testing

Basic testing is the “core linking” or resolving phase. That is where the linker merges object files. All test cases are written in YAML. One feature of YAML is that it allows multiple “documents” to be encoded in one YAML stream. That means one text file can appear to the linker as multiple .o files - the normal case for the linker.

Here is a simple example of a core linking test case. It checks that an undefined atom from one file will be replaced by a definition from another file:

# RUN: lld-core %s | FileCheck %s

#
# Test that undefined atoms are replaced with defined atoms.
#

---
atoms:
    - name:              foo
      definition:        undefined
---
atoms:
    - name:              foo
      scope:             global
      type:              code
...

# CHECK:       name:       foo
# CHECK:       scope:      global
# CHECK:       type:       code
# CHECK-NOT:   name:       foo
# CHECK:       ...

Passes testing

Since Passes just operate on an lld::File object, the lld-core tool has the option to run a particular pass (after resolving). Thus, you can write a YAML test case with carefully crafted input to exercise areas of a Pass and then check the resulting lld::File object as represented in YAML.

Design Issues

There are a number of open issues in the design of lld. The plan is to wait and make these design decisions when we need to.

Debug Info

Currently, the lld model says nothing about debug info. But the most popular debug format is DWARF and there is some impedance mismatch between the lld model and DWARF. In lld there are just Atoms, and only Atoms that need to be in a special section at runtime have an associated section. Also, Atoms do not have addresses. The way DWARF is specified, different parts of DWARF are supposed to go into specially named sections, and DWARF references function code by address.

CPU and OS specific functionality

Currently, lld has an abstract “Platform” that deals with any CPU or OS specific differences in linking. We just keep adding virtual methods to the base Platform class as we find linking areas that might need customization. At some point we’ll need to structure this better.

File Attributes

Currently, lld::File just has a path and a way to iterate its atoms. We will need to add more attributes on a File. For example, some equivalent to the target triple. There are also a number of cached or computed attributes that could make various Passes more efficient. For instance, on Darwin there are a number of Objective-C optimizations that can be done by a Pass. But it would improve the plain C case if the Objective-C optimization Pass did not have to scan all atoms looking for any Objective-C data structures. This could be done if the lld::File object had an attribute that said whether the file had any Objective-C data in it. The Resolving phase would then be required to “merge” that attribute as object files are added.

Getting Started: Building and Running lld

This page gives you the shortest path to checking out and building lld. If you run into problems, please file bugs in the LLVM Bugzilla.

Building lld
On Unix-like Systems
  1. Get the required tools.
  • CMake 2.8+.
  • make (or any build system CMake supports).
  • Clang 3.1+ or GCC 4.7+ (C++11 support is required).
    • If using Clang, you will also need libc++.
  • Python 2.4+ (not 3.x) for running tests.
  2. Check out LLVM:

    $ cd path/to/llvm-project
    $ svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
    
  3. Check out lld:

    $ cd llvm/tools
    $ svn co http://llvm.org/svn/llvm-project/lld/trunk lld
    
  • lld can also be checked out to path/to/llvm-project and built as an external project.
  4. Build LLVM and lld:

    $ cd path/to/llvm-build/llvm (out of source build required)
    $ cmake -G "Unix Makefiles" path/to/llvm-project/llvm
    $ make
    
  • If you want to build with clang and it is not the default compiler or it is installed in an alternate location, you’ll need to tell the cmake tool the location of the C and C++ compiler via CMAKE_C_COMPILER and CMAKE_CXX_COMPILER. For example:

    $ cmake -DCMAKE_CXX_COMPILER=/path/to/clang++ -DCMAKE_C_COMPILER=/path/to/clang ...
    
  5. Test:

    $ make check-lld
    
Using Visual Studio
  1. Get the required tools.
  2. Check out LLVM:

    $ cd path/to/llvm-project
    $ svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
    
  3. Check out lld:

    $ cd llvm/tools
    $ svn co http://llvm.org/svn/llvm-project/lld/trunk lld
    
  • lld can also be checked out to path/to/llvm-project and built as an external project.
  4. Generate Visual Studio project files:

    $ cd path/to/llvm-build/llvm (out of source build required)
    $ cmake -G "Visual Studio 11" path/to/llvm-project/llvm
    
  5. Build

  • Open LLVM.sln in Visual Studio.
  • Build the ALL_BUILD target.
  6. Test
  • Build the lld-test target.

More Information

For more information on using CMake see the LLVM CMake guide.

Development

Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.

lld is developed as part of the LLVM project.

Creating a Reader

See the Creating a Reader guide.

Modifying the Driver

See Driver.

Debugging

You can run lld with -mllvm -debug command line options to enable debugging printouts. If you want to enable debug information for some specific pass, you can run it with -mllvm '-debug-only=<pass>', where pass is a name used in the DEBUG_WITH_TYPE() macro.

Documentation

The project documentation is written in reStructuredText and generated using the Sphinx documentation generator. For more information on writing documentation for the project, see the Sphinx Introduction for LLVM Developers.

Developing lld Readers

Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.

Introduction

The purpose of a “Reader” is to take an object file in a particular format and create an lld::File (which is a graph of Atoms) representing the object file. A Reader inherits from lld::Reader which lives in include/lld/Core/Reader.h and lib/Core/Reader.cpp.

The Reader infrastructure for an object format Foo requires the following pieces in order to fit into lld:

include/lld/ReaderWriter/ReaderFoo.h

class ReaderOptionsFoo : public ReaderOptions

This Options class is the only way to configure how the Reader will parse any file into an lld::File object. This class should be declared in the lld namespace.

Reader *createReaderFoo(ReaderOptionsFoo &reader)

This factory function configures and creates the Reader. This function should be declared in the lld namespace.

lib/ReaderWriter/Foo/ReaderFoo.cpp

class ReaderFoo : public Reader

This is the concrete Reader class which can be called to parse object files. It should be declared in an anonymous namespace or if there is shared code with the lld::WriterFoo you can make a nested namespace (e.g. lld::foo).

You may have noticed that ReaderFoo is not declared in the .h file. An important design aspect of lld is that all Readers are created only through an object-format-specific createReaderFoo() factory function. The creation of the Reader is parametrized through a ReaderOptionsFoo class. This options class is the one-and-only way to control how the Reader operates when parsing an input file into an Atom graph. For instance, you may want the Reader to only accept certain architectures. The options class can be instantiated from command line options or be programmatically configured.

Where to start

The lld project already has a skeleton of source code for Readers for ELF, PECOFF, MachO, and lld’s native YAML graph format. If your file format is a variant of one of those, you should modify the existing Reader to support your variant. This is done by customizing the Options class for the Reader and making appropriate changes to the .cpp file to interpret those options and act accordingly.

If your object file format is not a variant of any existing Reader, you’ll need to create a new Reader subclass with the organization described above.

Readers are factories

The linker will usually only instantiate your Reader once. That one Reader will have its loadFile() method called many times with different input files. To support multithreaded linking, the Reader may be parsing multiple input files in parallel. Therefore, there should be no parsing state in your Reader object. Any parsing state should be in ivars of your File subclass or in some temporary object.

The key method to implement in a reader is:

virtual error_code loadFile(LinkerInput &input,
                            std::vector<std::unique_ptr<File>> &result);

It takes a memory buffer (which contains the contents of the object file being read) and returns an instantiated lld::File object, which is a collection of Atoms. The result is a vector of File pointers (instead of simply a File pointer) because some file formats allow multiple object “files” to be encoded in one file-system file.

Memory Ownership

Atoms are always owned by their File object. During core linking when Atoms are coalesced or stripped away, core linking does not delete them. Core linking just removes those unused Atoms from its internal list. The destructor of a File object is responsible for deleting all Atoms it owns, and if ownership of the MemoryBuffer was passed to it, the File destructor needs to delete that too.

Making Atoms

The internal model of lld is purely Atom based. But most object files do not have an explicit concept of Atoms, instead most have “sections”. The way to think of this is that a section is just a list of Atoms with common attributes.

The first step in parsing section-based object files is to cleave each section into a list of Atoms. The technique may vary by section type. For code sections (e.g. .text), there are usually symbols at the start of each function. Those symbol addresses are the points at which the section is cleaved into discrete Atoms. Some file formats (like ELF) also include the length of each symbol in the symbol table. Otherwise, the length of each Atom is calculated to run to the start of the next symbol or the end of the section.
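
A sketch of that cleaving step for a code section; SymbolInfo is a hypothetical helper type and the atom creation itself is elided:

#include <algorithm>
#include <cstdint>
#include <vector>

struct SymbolInfo {
  uint64_t address; // offset of the symbol within the section
  const char *name;
};

// Cleave [0, sectionSize) into one atom per symbol; each atom runs from
// its symbol to the next symbol, or to the end of the section.
void cleaveSection(std::vector<SymbolInfo> symbols, uint64_t sectionSize) {
  std::sort(symbols.begin(), symbols.end(),
            [](const SymbolInfo &a, const SymbolInfo &b) {
              return a.address < b.address;
            });
  for (size_t i = 0; i < symbols.size(); ++i) {
    uint64_t end =
        (i + 1 < symbols.size()) ? symbols[i + 1].address : sectionSize;
    // A DefinedAtom named symbols[i].name would be created here, covering
    // section bytes [symbols[i].address, end).
    (void)end;
  }
}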

Other section types can be implicitly cleaved. For instance, c-string literals or unwind info (e.g. .eh_frame) can be cleaved by having the Reader look at the content of the section. It is important to cleave sections into Atoms to remove false dependencies. For instance, the .eh_frame section often has no symbols, but contains “pointers” to the functions for which it has unwind info. If the .eh_frame section were not cleaved (but left as one big Atom), there would always be a reference (from the eh_frame Atom) to each function. So the linker would be unable to coalesce or dead strip the function atoms.

The lld Atom model also requires that a reference to an undefined symbol be modeled as a Reference to an UndefinedAtom. So the Reader also needs to create an UndefinedAtom for each undefined symbol in the object file.

Once all Atoms have been created, the second step is to create References (recall that Atoms are “nodes” and References are “edges”). Most References are created by looking at the “relocation records” in the object file. If a function contains a call to “malloc”, there is usually a relocation record specifying the address in the section and the symbol table index. Your Reader will need to convert the address to an Atom and offset and the symbol table index into a target Atom. If “malloc” is not defined in the object file, the target Atom of the Reference will be an UndefinedAtom.

Performance

Once you have the above working to parse an object file into Atoms and References, you’ll want to look at performance. Some techniques that can help performance are:

  • Use llvm::BumpPtrAllocator or pre-allocate one big vector<Reference> and then just have each atom point to its subrange of References in that vector. This can be faster than allocating each Reference as a separate object.
  • Pre-scan the symbol table and determine how many atoms are in each section then allocate space for all the Atom objects at once.
  • Don’t copy symbol names or section content to each Atom, instead use StringRef and ArrayRef in each Atom to point to its name and content in the MemoryBuffer.
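
For instance, the first tip might look like this; the Atom type is hypothetical, while llvm::BumpPtrAllocator is the real LLVM class:

#include <new>

#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Allocator.h"

struct Atom {
  llvm::StringRef name; // points into the MemoryBuffer; no copy is made
  explicit Atom(llvm::StringRef n) : name(n) {}
};

llvm::BumpPtrAllocator allocator;

Atom *makeAtom(llvm::StringRef name) {
  // One pointer bump instead of a heap allocation per atom; everything is
  // freed in one shot when the allocator is destroyed.
  return new (allocator.Allocate<Atom>()) Atom(name);
}
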
Testing

We are still working on infrastructure to test Readers. The issue is that you don’t want to check binary files into the test suite. And the tools for creating your object file from assembly source may not be available on every OS.

We are investigating a way to use YAML to describe the section, symbols, and content of a file. Then have some code which will write out an object file from that YAML description.

Once that is in place, you can write test cases that contain section/symbol YAML and are run through the linker to produce Atom/Reference based YAML, which is then run through FileCheck to verify the Atoms and References are as expected.

Driver

Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.

Introduction

This document describes the lld driver. The purpose of this document is to describe both the motivation and design goals for the driver, as well as details of the internal implementation.

Overview

The lld driver is designed to support a number of different command line interfaces. The main interfaces we plan to support are binutils’ ld, Apple’s ld, and Microsoft’s link.exe.

Flavors

Each of these different interfaces is referred to as a flavor. There is also an extra flavor “core” which is used to exercise the core functionality of the linker in the test suite.

  • gnu
  • darwin
  • link
  • core

Selecting a Flavor

There are two different ways to tell lld which flavor to be. They are checked in order, so the second overrides the first. The first is to symlink lld as lld-{flavor} or just {flavor}. You can also specify it as the first command line argument using -flavor:

$ lld -flavor gnu

There is a shortcut for -flavor core as -core.

Adding an Option to an existing Flavor
  1. Add the option to the desired lib/Driver/flavorOptions.td.
  2. Add to lld::FlavorLinkingContext a getter and setter method for the option.
  3. Modify lld::FlavorDriver::parse() in lib/Driver/{Flavor}Driver.cpp to call the targetInfo setter corresponding to the option.
  4. Modify {Flavor}Reader and {Flavor}Writer to use the new targetInfo option.

Adding a Flavor
  1. Add an entry for the flavor in include/lld/Common/Driver.h to lld::UniversalDriver::Flavor.
  2. Add an entry in lib/Driver/UniversalDriver.cpp to lld::Driver::strToFlavor() and lld::UniversalDriver::link(). This allows the flavor to be selected via symlink and -flavor.
  3. Add a tablegen file called lib/Driver/flavorOptions.td that describes the options. If the options are a superset of another driver, that driver’s td file can simply be included. The flavorOptions.td file must also be added to lib/Driver/CMakeLists.txt.
  4. Add a {flavor}Driver as a subclass of lld::Driver in lib/Driver/flavorDriver.cpp.

Open Projects

include/lld/Core

  • The yaml reader/writer interfaces should be changed to return an explanatory string if there is an error. The existing error_code abstraction only works for returning low level OS errors. It does not work for describing formatting issues.
  • We need to add more attributes to File. In particular, we need CPU and OS information (like target triples). We should also provide explicit support for LLVM IR module flags metadata.

Documentation TODOs

Sphinx Introduction for LLVM Developers

This document is intended as a short and simple introduction to the Sphinx documentation generation system for LLVM developers.

Quickstart

To get started writing documentation, you will need to:

  1. Have the Sphinx tools installed.
  2. Understand how to build the documentation.
  3. Start writing documentation!

Installing Sphinx

You should be able to install Sphinx using the standard Python package installation tool easy_install, as follows:

$ sudo easy_install sphinx
Searching for sphinx
Reading http://pypi.python.org/simple/sphinx/
Reading http://sphinx.pocoo.org/
Best match: Sphinx 1.1.3
... more lines here ..

If you do not have root access (or otherwise want to avoid installing Sphinx in system directories) see the section on Installing Sphinx in a Virtual Environment.

If you do not have the easy_install tool on your system, you should be able to install it using:

Linux
Use your distribution’s standard package management tool to install it, i.e., apt-get install easy_install or yum install easy_install.
Mac OS X
All modern Mac OS X systems come with easy_install as part of the base system.
Windows
See the setuptools package web page for instructions.

Building the documentation

In order to build the documentation you need to add -DLLVM_ENABLE_SPHINX=ON to your cmake command. Once you do this you can build the docs using the docs-lld-html build target (ninja or make).

That build target will invoke sphinx-build with the appropriate options for the project, and generate the HTML documentation in a tools/lld/docs/html subdirectory.
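
For example, with a Ninja build tree the steps might look like this (the source path is illustrative):

$ cmake -G Ninja -DLLVM_ENABLE_SPHINX=ON path/to/llvm-project/llvm
$ ninja docs-lld-html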

Writing documentation

The documentation itself is written in the reStructuredText (ReST) format, and Sphinx defines additional tags to support features like cross-referencing.

The ReST format itself is organized around documents mostly being readable plaintext documents. You should generally be able to write new documentation easily just by following the style of the existing documentation.

If you want to understand the formatting of the documents more, the best place to start is Sphinx’s own ReST Primer.

Learning More

If you want to learn more about the Sphinx system, the best place to start is the Sphinx documentation itself, available here.

Installing Sphinx in a Virtual Environment

Most Python developers prefer to work with tools inside a virtualenv (virtual environment) instance, which functions as an application sandbox. This avoids polluting your system installation with different packages used by various projects (and ensures that dependencies for different packages don’t conflict with one another). Of course, you need to first have the virtualenv software itself which generally would be installed at the system level:

$ sudo easy_install virtualenv

but after that you no longer need to install additional packages in the system directories.

Once you have the virtualenv tool itself installed, you can create a virtualenv for Sphinx using:

$ virtualenv ~/my-sphinx-install
New python executable in /Users/dummy/my-sphinx-install/bin/python
Installing setuptools............done.
Installing pip...............done.

$ ~/my-sphinx-install/bin/easy_install sphinx
... install messages here ...

and from now on you can “activate” the virtualenv using:

$ source ~/my-sphinx-install/bin/activate

which will change your PATH to ensure the sphinx-build tool from inside the virtual environment will be used. See the virtualenv website for more information on using virtual environments.

WebAssembly lld port

Note: The WebAssembly port is still a work in progress and is lacking certain features.

The WebAssembly version of lld takes WebAssembly binaries as inputs and produces a WebAssembly binary as its output. For the most part this port tries to mimic the behaviour of traditional ELF linkers, and specifically the ELF lld port. Where possible, the command line flags and semantics are the same.

Object file format

The format of the input object files that lld expects is specified as part of the WebAssembly tool conventions: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md.

This is the object format that LLVM produces when run with the wasm32-unknown-unknown target. Building LLVM with WebAssembly support currently requires enabling the experimental backend using -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=WebAssembly.
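
For example, a build enabling the experimental backend might look like this (the generator and source path are illustrative):

$ cmake -G Ninja -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=WebAssembly path/to/llvm-project/llvm
$ ninja lld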

Missing features

There are several key features that are not yet implemented in the WebAssembly port:

  • COMDAT support. This means that support for C++ is still very limited.
  • Function stripping. Currently there is no support for --gc-sections, so functions and data from a given object will be linked as a unit.
  • Section start/end symbols. The synthetic symbols that mark the start and end of data regions are not yet created in the output file.

Windows support

LLD supports the Windows operating system. When invoked as lld-link.exe or with -flavor link, the Windows driver is used to parse command line options and drive the linking process. LLD accepts almost all command line options that the linker shipped with Microsoft Visual C++ (link.exe) supports.

The current status is that LLD can link itself on Windows x86/x64 using Visual C++ 2013 as the compiler.

Development status

Driver
Mostly done. Some exotic command line options that are not usually used for application development, such as /DRIVER, are not supported.
Linking against DLL
Done. LLD can read import libraries needed to link against DLL. Both export-by-name and export-by-ordinal are supported.
Linking against static library
Done. The format of a static library (.lib) on Windows is actually the same as on Unix (.a). LLD can read it.
Creating DLL
Done. LLD creates a DLL if /DLL option is given. Exported functions can be specified either via command line (/EXPORT) or via module-definition file (.def). Both export-by-name and export-by-ordinal are supported.
Windows resource files support
Done. If a .res file is given, LLD converts the file to a COFF file using LLVM’s Object library.
Safe Structured Exception Handler (SEH)
Done for both x86 and x64.
Module-definition file
Partially done. LLD currently recognizes these directives: EXPORTS, HEAPSIZE, STACKSIZE, NAME, and VERSION.
Debug info
Done. LLD can emit PDBs that are at parity with those generated by link.exe. However, LLD does not support /DEBUG:FASTLINK.
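
For example, creating a DLL that exports one function might look like this (file and symbol names are hypothetical, and a real link would also need the usual CRT setup or /NOENTRY):

> lld-link /DLL /EXPORT:greet /OUT:greet.dll greet.obj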

Downloading LLD

The Windows version of LLD is included in the pre-built binaries of LLVM’s releases and in the LLVM Snapshot Builds.

Building LLD

Using Visual Studio IDE/MSBuild
  1. Check out LLVM and LLD from the LLVM SVN repository (or Git mirror),
  2. run cmake -G "Visual Studio 12" <llvm-source-dir> from VS command prompt,
  3. open LLVM.sln with Visual Studio, and
  4. build lld target in lld executables folder

Alternatively, you can use msbuild if you don’t like to work in an IDE:

msbuild LLVM.sln /m /target:"lld executables\lld"

MSBuild.exe used to ship as a component of the .NET Framework, but since 2013 it has been part of Visual Studio. You can find it at “C:\Program Files (x86)\msbuild”.

You can build LLD as a 64 bit application. To do that, open VS2013 x64 command prompt and run cmake for “Visual Studio 12 Win64” target.

Using Ninja
  1. Check out LLVM and LLD from the LLVM SVN repository (or Git mirror),
  2. run cmake -G Ninja <llvm-source-dir> from VS command prompt,
  3. run ninja lld

LLD 8.0.0 Release Notes

Warning

These are in-progress notes for the upcoming LLVM 8.0.0 release. Release notes for previous releases can be found on the Download Page.

Introduction

This document contains the release notes for the lld linker, release 8.0.0. Here we describe the status of lld, including major improvements from the previous release. All lld releases may be downloaded from the LLVM releases web site.