LLD - The LLVM Linker¶
LLD is a linker from the LLVM project. It is a drop-in replacement for system linkers and runs much faster than them. It also provides features that are useful for toolchain developers.
The linker supports ELF (Unix), PE/COFF (Windows), Mach-O (macOS) and WebAssembly in descending order of completeness. Internally, LLD consists of several different linkers. The ELF port is the one that will be described in this document. The PE/COFF port is complete, including Windows debug info (PDB) support. The WebAssembly port is still a work in progress (See WebAssembly lld port). The Mach-O port is built based on a different architecture than the others. For the details about Mach-O, please read ATOM-based lld.
Features¶
LLD is a drop-in replacement for the GNU linkers. It accepts the same command line arguments and linker scripts as the GNU linkers.
We are currently working closely with the FreeBSD project to make LLD the default system linker in future versions of the operating system, so we are serious about addressing compatibility issues. As of February 2017, LLD is able to link the entire FreeBSD/amd64 base system including the kernel. With a few work-in-progress patches it can link approximately 95% of the ports collection on AMD64. For details, see the FreeBSD quarterly status report.
LLD is very fast. When you link a large program on a multicore machine, you can expect that LLD runs more than twice as fast as the GNU gold linker. Your mileage may vary, though.
It supports various CPUs/ABIs including x86-64, x86, x32, AArch64, ARM, MIPS 32/64 big/little-endian, PowerPC, PowerPC 64 and AMDGPU. Among these, x86-64 is the most well-supported target and has reached production quality. AArch64 and MIPS seem decent too. x86 should be OK but is not well tested yet. ARM support is being developed actively.
It is always a cross-linker, meaning that it always supports all the above targets however it was built. In fact, we don’t provide a build-time option to enable/disable each target. This should make it easy to use our linker as part of a cross-compile toolchain.
You can embed LLD in your program to eliminate the dependency on external linkers. All you have to do is construct object files and command line arguments just as you would to invoke an external linker, and then call the linker’s main function, lld::elf::link, from your code.

It is small. We are using the LLVM libObject library to read object files, so it is not a completely fair comparison, but as of February 2017, LLD/ELF consists of only 21k lines of C++ code while GNU gold consists of 198k lines of C++ code.
Link-time optimization (LTO) is supported by default. Essentially, all you have to do to use LTO is to pass the -flto option to clang. Clang then creates object files not in the native object file format but in LLVM bitcode format. LLD reads bitcode object files, compiles them using LLVM and emits an output file. Because LLD can see the entire program this way, it can do whole-program optimization.

Some very old features for ancient Unix systems (pre-90s or even before that) have been removed. Some default settings have been tuned for the 21st century. For example, the stack is marked as non-executable by default to tighten security.
Performance¶
This is a link time comparison on a 2-socket 20-core 40-thread Xeon
E5-2680 2.80 GHz machine with an SSD drive. We ran gold and lld with
or without multi-threading support. To disable multi-threading, we
added -no-threads
to the command lines.
Program | Output size | GNU ld | GNU gold w/o threads | GNU gold w/threads | lld w/o threads | lld w/threads |
ffmpeg dbg | 92 MiB | 1.72s | 1.16s | 1.01s | 0.60s | 0.35s |
mysqld dbg | 154 MiB | 8.50s | 2.96s | 2.68s | 1.06s | 0.68s |
clang dbg | 1.67 GiB | 104.03s | 34.18s | 23.49s | 14.82s | 5.28s |
chromium dbg | 1.14 GiB | 209.05s [1] | 64.70s | 60.82s | 27.60s | 16.70s |
As you can see, lld is significantly faster than the GNU linkers. Note that this is just a benchmark result from our environment. Depending on the number of available cores, the amount of available memory, and disk latency/throughput, your results may vary.
[1] Since GNU ld doesn’t support the -icf=all and -gdb-index options, we removed them from the command line for GNU ld. GNU ld would have been slower than this if it had these options.
Build¶
If you have already checked out LLVM using SVN, you can check out LLD under the tools directory just like you probably did for clang. For details, see Getting Started with the LLVM System.

If you haven’t checked out LLVM, the easiest way to build LLD is to check out the entire LLVM project from a git mirror and build that tree. You need cmake and of course a C++ compiler.
$ git clone https://github.com/llvm-project/llvm-project-20170507 llvm-project
$ mkdir build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=lld -DCMAKE_INSTALL_PREFIX=/usr/local ../llvm-project/llvm
$ make install
Using LLD¶
LLD is installed as ld.lld. On Unix, linkers are invoked by compiler drivers, so you are not expected to use that command directly. There are a few ways to tell compiler drivers to use ld.lld instead of the default linker.

The easiest way to do that is to overwrite the default linker. After installing LLD to somewhere on your disk, you can create a symbolic link by doing ln -s /path/to/ld.lld /usr/bin/ld so that /usr/bin/ld is resolved to LLD.
If you don’t want to change the system setting, you can use clang’s -fuse-ld option. To do this, set -fuse-ld=lld in LDFLAGS when building your programs.
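For example, a one-off link driven through clang might look like this (file and program names are illustrative):

$ clang -fuse-ld=lld -o hello hello.c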
LLD leaves its name and version number in a .comment section in the output. If you are in doubt whether you are successfully using LLD or not, run readelf --string-dump .comment <output-file> and examine the output. If the string “Linker: LLD” is included in the output, you are using LLD.
History¶
Here is a brief project history of the ELF and COFF ports.
- May 2015: We decided to rewrite the COFF linker and did that. We noticed that the new linker was much faster than the MSVC linker.
- July 2015: The new ELF port was developed based on the COFF linker architecture.
- September 2015: The first patches to support MIPS and AArch64 landed.
- October 2015: Succeeded in self-hosting the ELF port. We noticed that the linker was faster than the GNU linkers, but we weren’t sure at the time whether we would be able to keep the gap as we added more features to the linker.
- July 2016: Started working on improving the linker script support.
- December 2016: Succeeded in building the entire FreeBSD base system including the kernel. We had widened the performance gap against the GNU linkers.
Internals¶
For the internals of the linker, please read The ELF, COFF and Wasm Linkers. It is a bit outdated but the fundamental concepts remain valid. We’ll update the document soon.
The ELF, COFF and Wasm Linkers¶
The ELF Linker as a Library¶
You can embed LLD in your program by linking against it and calling the linker’s entry point function lld::elf::link.
The current policy is that it is your responsibility to provide trustworthy object files. The function is guaranteed to return as long as you do not pass corrupted or malicious object files. A corrupted file could cause a fatal error or SEGV. That being said, you don’t need to worry too much about it if you create object files in the usual way and give them to the linker. It is naturally expected to work, and otherwise it’s a linker bug.
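As a rough illustration, embedding the ELF linker could look like the sketch below. The exact signature of lld::elf::link has changed across releases, so treat the parameters here as an assumption and check lld/Common/Driver.h in your checkout; the input file names are illustrative.

#include "lld/Common/Driver.h"
#include "llvm/Support/raw_ostream.h"

bool linkInProcess() {
  // Arguments exactly as they would appear on an ld.lld command line.
  const char *args[] = {"ld.lld", "main.o", "util.o", "-o", "a.out"};
  // canExitEarly=false asks the linker to return on a fatal error instead
  // of calling exit() and taking the host process down with it.
  return lld::elf::link(args, /*canExitEarly=*/false, llvm::errs());
}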
Design¶
We will describe the design of the linkers in the rest of the document.
Key Concepts¶
Linkers are fairly large pieces of software. There are many design choices you have to make to create a complete linker.
This is a list of design choices we’ve made for ELF and COFF LLD. We believe that these high-level design choices strike the right balance between speed, simplicity and extensibility.
Implement as native linkers
We implemented the linkers as native linkers for each file format.
The linkers share the same design but share very little code. Sharing code makes sense if the benefit is worth its cost. In our case, the object formats are different enough that we thought the layer to abstract the differences wouldn’t be worth its complexity and run-time cost. Elimination of the abstract layer has greatly simplified the implementation.
Speed by design
One of the most important things in achieving high performance is to do less, rather than to do the same thing more efficiently. Therefore, the high-level design matters more than local optimizations. Since we are trying to create a high-performance linker, it is very important to keep the design as efficient as possible.
Broadly speaking, we do not do anything until we have to do it. For example, we do not read section contents or relocations until we need them to continue linking. When we need to do some costly operation (such as looking up a hash table for each symbol), we do it only once: we obtain a handle (which is typically just a pointer to the actual data) on the first operation and use it throughout the process.
Efficient archive file handling
LLD’s handling of archive files (the files with “.a” file extension) is different from the traditional Unix linkers and similar to Windows linkers. We’ll describe how the traditional Unix linker handles archive files, what the problem is, and how LLD approached the problem.
The traditional Unix linker maintains a set of undefined symbols during linking. The linker visits each file in the order in which it appeared on the command line until the set becomes empty. What the linker does depends on the file type.
- If the linker visits an object file, the linker links object files to the result, and undefined symbols in the object file are added to the set.
- If the linker visits an archive file, it checks for the archive file’s symbol table and extracts all object files that have definitions for any symbols in the set.
This algorithm sometimes leads to counter-intuitive behavior. If you give archive files before object files, nothing will happen, because when the linker visits the archives there are no undefined symbols in the set. As a result, no files are extracted from the first archive file, and the link is done at that point because the set is empty after it visits one file.
You can fix the problem by reordering the files, but that cannot fix the issue of mutually-dependent archive files.
Linking mutually-dependent archive files is tricky. You may specify the same archive file multiple times to let the linker visit it more than once. Or you may use the special command line options --start-group and --end-group to let the linker loop over the files between the options until no new symbols are added to the set.
Visiting the same archive files multiple times makes the linker slower.
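To illustrate with hypothetical file names, here is how the order sensitivity and the group options show up on a traditional Unix linker command line:

$ ld -o prog main.o libfoo.a   # works: main.o's undefined symbols pull members from libfoo.a
$ ld -o prog libfoo.a main.o   # nothing is extracted: the set is empty when libfoo.a is visited
$ ld -o prog main.o --start-group liba.a libb.a --end-group   # loops over mutually-dependent archives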
Here is how LLD approaches the problem. Instead of memorizing only undefined symbols, we programmed LLD so that it memorizes all symbols. When it sees an undefined symbol that can be resolved by extracting an object file from an archive file it previously visited, it immediately extracts and links the file. This is doable because LLD does not forget symbols it has seen in archive files.
We believe that LLD’s way is efficient and easy to justify.
The semantics of LLD’s archive handling are different from the traditional Unix semantics. You can observe this if you carefully craft archive files to exploit it. However, in practice we don’t know of any program that cannot be linked with our algorithm, so it’s not going to cause trouble.
Numbers You Want to Know¶
To give you an intuition for what kinds of data the linker mainly works on, here is a list of the objects LLD has to read and process, with their counts, in order to link a very large executable. To link Chrome with debug info, which is roughly 2 GB in output size, LLD reads
- 17,000 files,
- 1,800,000 sections,
- 6,300,000 symbols, and
- 13,000,000 relocations.
LLD produces the 2 GB executable in 15 seconds.
These numbers vary depending on your program, but in general, you have a lot of relocations and symbols for each file. If your program is written in C++, symbol names are likely to be pretty long because of name mangling.
It is important to not waste time on relocations and symbols.
In the above case, the total size of the symbol strings is 450 MB, and inserting all of them into a hash table takes 1.5 seconds. Therefore, if you casually add a hash table lookup for each symbol, it would slow down the linker by 10%. So, don’t do that.
On the other hand, you don’t have to pursue efficiency when handling files.
Important Data Structures¶
We will describe the key data structures in LLD in this section. The linker can be understood as the interactions between them. Once you understand their functions, the code of the linker should look obvious to you.
Symbol
This class represents a symbol. They are created for symbols in object files or archive files. The linker creates linker-defined symbols as well.
There are basically three types of Symbols: Defined, Undefined, or Lazy.
- Defined symbols are for all symbols that are considered as “resolved”, including real defined symbols, COMDAT symbols, common symbols, absolute symbols, linker-created symbols, etc.
- Undefined symbols represent undefined symbols, which need to be replaced by Defined symbols by the resolver before the link is complete.
- Lazy symbols represent symbols found in archive file headers which can turn into Defined symbols if we read the archive members.
There’s only one Symbol instance for each unique symbol name. This uniqueness is guaranteed by the symbol table. As the resolver reads symbols from input files, it replaces an existing Symbol with the “best” Symbol for its symbol name using placement new.
The above mechanism allows you to use pointers to Symbols as a very cheap way to access name resolution results. Assume for example that you have a pointer to an undefined symbol before name resolution. If the symbol is resolved to a defined symbol by the resolver, the pointer will “automatically” point to the defined symbol, because the undefined symbol the pointer pointed to will have been replaced by the defined symbol in-place.
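A minimal sketch of this in-place replacement, using hypothetical, simplified classes (real LLD sizes each symbol table slot to fit the largest Symbol subclass):

#include <new>
#include <type_traits>

// Hypothetical stand-ins for LLD's Symbol hierarchy.
struct Symbol    { virtual ~Symbol() = default; };
struct Undefined : Symbol {};
struct Defined   : Symbol { unsigned long value = 0; };

// Storage large and aligned enough for any subclass, like a symbol table slot.
using SymbolSlot = std::aligned_union_t<0, Undefined, Defined>;

Symbol *makeUndefined(SymbolSlot &slot) {
  return new (&slot) Undefined();
}

// After this call, every existing Symbol* into the slot observes the Defined
// symbol, with no pointer updates anywhere else in the linker.
Symbol *replaceWithDefined(Symbol *sym, unsigned long value) {
  sym->~Symbol();
  Defined *d = new (sym) Defined();
  d->value = value;
  return d;
}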
SymbolTable
SymbolTable is basically a hash table from strings to Symbols with logic to resolve symbol conflicts. It resolves conflicts by symbol type.
- If we add Defined and Undefined symbols, the symbol table will keep the former.
- If we add Defined and Lazy symbols, it will keep the former.
- If we add Lazy and Undefined, it will keep the former, but it will also trigger the Lazy symbol to load the archive member to actually resolve the symbol.
Chunk (COFF specific)
Chunk represents a chunk of data that will occupy space in an output. Each regular section becomes a chunk. Chunks created for common or BSS symbols are not backed by sections. The linker may create chunks to append additional data to an output as well.
Chunks know about their size, how to copy their data to mmap’ed outputs, and how to apply relocations to them. Specifically, section-based chunks know how to read relocation tables and how to apply them.
InputSection (ELF specific)
Since we have less synthesized data for ELF, we don’t abstract slices of input files as Chunks for ELF. Instead, we directly use the input section as an internal data type.
InputSections know their size and how to copy themselves to mmap’ed outputs, just like COFF Chunks.
OutputSection
OutputSection is a container of InputSections (ELF) or Chunks (COFF). An InputSection or Chunk belongs to at most one OutputSection.
There are mainly three actors in this linker.
InputFile
InputFile is a superclass of file readers. We have a different subclass for each input file type, such as regular object file, archive file, etc. They are responsible for creating and owning Symbols and InputSections/Chunks.
Writer
The writer is responsible for writing file headers and InputSections/Chunks to a file. It creates OutputSections, puts all InputSections/Chunks into them, assigns unique, non-overlapping addresses and file offsets to them, and then writes them out to a file.
Driver
The linking process is driven by the driver. The driver:
- processes command line options,
- creates a symbol table,
- creates an InputFile for each input file and puts all symbols within into the symbol table,
- checks that there are no remaining undefined symbols,
- creates a writer,
- and passes the symbol table to the writer to write the result to a file.
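Condensed into code, that flow looks roughly like the sketch below; every class and helper here is a hypothetical stand-in for much richer LLD machinery:

#include <memory>
#include <string>
#include <vector>

struct SymbolTable { void reportRemainingUndefined() {} };
struct InputFile   { void parse(SymbolTable &) {} };
struct Writer      { void run(SymbolTable &) {} };

std::unique_ptr<InputFile> openInputFile(const std::string &) {
  return std::make_unique<InputFile>();
}

void driveLink(const std::vector<std::string> &paths) {
  SymbolTable symtab;                          // create a symbol table
  std::vector<std::unique_ptr<InputFile>> files;
  for (const std::string &p : paths) {         // one InputFile per input
    files.push_back(openInputFile(p));
    files.back()->parse(symtab);               // feed its symbols to the table
  }
  symtab.reportRemainingUndefined();           // any undefined symbols left?
  Writer().run(symtab);                        // write the result to a file
}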
Link-Time Optimization¶
LTO is implemented by handling LLVM bitcode files as object files. The linker resolves symbols in bitcode files normally. If all symbols are successfully resolved, it then runs LLVM passes with all bitcode files to convert them to one big regular ELF/COFF file. Finally, the linker replaces bitcode symbols with ELF/COFF symbols, so that they are linked as if they were in the native format from the beginning.
The details are described in this document: http://llvm.org/docs/LinkTimeOptimization.html
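In practice, driving LTO through clang and LLD looks like this (file names are illustrative):

$ clang -flto -c a.c b.c                     # emits LLVM bitcode instead of native objects
$ clang -flto -fuse-ld=lld a.o b.o -o prog   # LLD runs LLVM passes over the bitcode, then links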
Glossary¶
RVA (COFF)
Short for Relative Virtual Address.
Windows executables and DLLs are not position-independent; they are linked against a fixed address called the image base. RVAs are offsets from the image base.

The default image bases are 0x140000000 for executables and 0x180000000 for DLLs. For example, when we are creating an executable, we assume that the executable will be loaded at address 0x140000000 by the loader, so we apply relocations accordingly. The resulting text and data will contain raw absolute addresses.
VA
Short for Virtual Address. For COFF, it is equivalent to RVA + image base.
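For example, with the default executable image base of 0x140000000, an RVA of 0x1000 corresponds to a VA of 0x140000000 + 0x1000 = 0x140001000.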
Base relocations (COFF)
Relocation information for the loader. If the loader decides to map an executable or a DLL to an address other than its image base, it fixes up the binary using information in the base relocation table. The table consists of a list of locations that contain absolute addresses. The loader adds the difference between the actual load address and the image base to all locations listed there.
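Conceptually, the loader's fix-up loop looks like the sketch below (hypothetical types; real base relocation tables are encoded as blocks of page-relative entries):

#include <cstdint>
#include <vector>

// Add the load delta to every location listed in the base relocation table.
void applyBaseRelocations(uint8_t *image, uint64_t imageBase,
                          uint64_t actualBase,
                          const std::vector<uint32_t> &fixupRVAs) {
  uint64_t delta = actualBase - imageBase;
  for (uint32_t rva : fixupRVAs) {
    // Each entry names a location that holds a raw absolute address.
    uint64_t *loc = reinterpret_cast<uint64_t *>(image + rva);
    *loc += delta;
  }
}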
Note that this run-time relocation mechanism is much simpler than ELF. There’s no PLT or GOT. Images are relocated as a whole just by shifting entire images in memory by some offsets. Although doing this breaks text sharing, I think this mechanism is not actually bad on today’s computers.
ICF
Short for Identical COMDAT Folding (COFF) or Identical Code Folding (ELF).
ICF is an optimization to reduce output size by merging read-only sections that are identical not only in name but in content. If two read-only sections happen to have the same metadata, contents and relocations, they are merged by ICF. It is known to be an effective technique, and it usually reduces a C++ program’s size by a few percent or more.
Note that this is not an entirely sound optimization. C and C++ require that different functions have different addresses. If a program depends on that property, it could fail at runtime.
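For example, these two functions typically compile to identical machine code, so ICF may fold them; a program that compares their addresses would then misbehave:

// Identical bodies, hence identical machine code: candidates for folding.
int incrementA(int x) { return x + 1; }
int incrementB(int x) { return x + 1; }

bool distinct() {
  // The language guarantees this is true, but with aggressive ICF both
  // functions can end up at the same address, making it false.
  return &incrementA != &incrementB;
}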
On Windows, that’s not really an issue because MSVC’s link.exe has enabled the optimization by default. As long as your program works with the linker’s default settings, it should be safe with ICF.
On Unix, your program is generally not guaranteed to be safe with ICF, although large programs happen to work correctly. LLD itself works fine with ICF, for example.
ATOM-based lld¶
Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.
ATOM-based lld is a new set of modular code for creating linker tools. Currently it supports Mach-O.
- End-User Features:
- Compatible with existing linker options
- Reads standard Object Files
- Writes standard Executable Files
- Remove clang’s reliance on “the system linker”
- Uses the LLVM “UIUC” BSD-Style license.
- Applications:
- Modular design
- Support cross linking
- Easy to add new CPU support
- Can be built as static tool or library
- Design and Implementation:
- Extensive unit tests
- Internal linker model can be dumped/read to textual format
- Additional linking features can be plugged in as “passes”
- OS specific and CPU specific code factored out
Why a new linker?¶
The fact that clang relies on whatever linker tool you happen to have installed means that clang has been very conservative adopting features which require a recent linker.
In the same way that the MC layer of LLVM has removed clang’s reliance on the system assembler tool, the lld project will remove clang’s reliance on the system linker tool.
Linker Design¶
Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.
Introduction¶
lld is a new generation of linker. It is not “section” based like traditional linkers, which mostly just interlace sections from multiple object files into the output file. Instead, lld is based on “Atoms”. Traditional section-based linking works well for simple linking, but that model makes advanced linking features difficult to implement. Features like dead code stripping, reordering functions for locality, and C++ coalescing require the linker to work at a finer grain.
An atom is an indivisible chunk of code or data. An atom has a set of attributes, such as: name, scope, content-type, alignment, etc. An atom also has a list of References. A Reference contains: a kind, an optional offset, an optional addend, and an optional target atom.
The Atom model allows the linker to use standard graph theory models for linking data structures. Each atom is a node, and each Reference is an edge. The feature of dead code stripping is implemented by following edges to mark all live atoms, and then deleting the non-live atoms.
Atom Model¶
An atom is an indivisible chunk of code or data. Typically each user written function or global variable is an atom. In addition, the compiler may emit other atoms, such as for literal c-strings or floating point constants, or for runtime data structures like dwarf unwind info or pointers to initializers.
A simple “hello world” object file would be modeled like this:

[figure omitted: the atom graph for the “hello world” object file]
There are three atoms: main, a proxy for printf, and an anonymous atom containing the c-string literal “hello world”. The Atom “main” has two references. One is the call site for the call to printf, and the other is a reference for the instruction that loads the address of the c-string literal.
There are only four different types of atoms:
- DefinedAtom
95% of all atoms. This is a chunk of code or data
- UndefinedAtom
This is a placeholder in object files for a reference to some atom outside the translation unit. During core linking it is usually replaced by (coalesced into) another Atom.
- SharedLibraryAtom
If a required symbol name turns out to be defined in a dynamic shared library (and not in some object file), a SharedLibraryAtom is the placeholder Atom used to represent that fact.
It is similar to an UndefinedAtom, but it also tracks information about the associated shared library.
- AbsoluteAtom
This is for embedded support where some functionality is implemented in ROM at some fixed address. This atom has no content. It is just an address that the Writer needs to fix up any references to point to.
File Model¶
The linker views the input files as basically containers of Atoms and References, plus just a few attributes of their own. The linker works with three kinds of files: object files, static libraries, and dynamic shared libraries. Each kind of file has a reader object which presents the file in the model expected by the linker.
An object file is just a container of atoms. When linking an object file, a reader is instantiated which parses the object file and instantiates a set of atoms representing all content in the .o file. The linker adds all those atoms to a master graph.
A static library is the traditional Unix static archive, which is just a collection of object files with a “table of contents”. When linking with a static library, by default nothing is added to the master graph of atoms. Instead, if, after merging all atoms from object files into a master graph, any “undefined” atoms are left remaining in the master graph, the linker reads the table of contents of each static library to see if any have the needed definitions. If so, the set of atoms from the specified object file in the static library is added to the master graph of atoms.
Linking Steps¶
Through the use of abstract Atoms, the core of linking is architecture independent and file format independent. All command line parsing is factored out into a separate “options” abstraction which enables the linker to be driven with different command line sets.
The overall steps in linking are:
- Command line processing
- Parsing input files
- Resolving
- Passes/Optimizations
- Generate output file
The Resolving and Passes steps are done purely on the master graph of atoms, so they have no notion of file formats such as mach-o or ELF.
Existing developer tools use different file formats for object files. A goal of lld is to be file-format independent. This is done through a plug-in model for reading object files. The lld::Reader is the base class for all object file readers. A Reader follows the factory method pattern: it instantiates an lld::File object (which is a graph of Atoms) from a given object file (on disk or in-memory).
Every Reader subclass defines its own “options” class (for instance the mach-o Reader defines the class ReaderOptionsMachO). This options class is the one-and-only way to control how the Reader operates when parsing an input file into an Atom graph. For instance, you may want the Reader to only accept certain architectures. The options class can be instantiated from command line options, or it can be subclassed and the ivars programmatically set.
The resolving step takes all the atoms’ graphs from each object file and combines them into one master object graph. Unfortunately, it is not as simple as appending the atom list from each file into one big list. There are many cases where atoms need to be coalesced. That is, two or more atoms need to be coalesced into one atom. This is necessary to support: C language “tentative definitions”, C++ weak symbols for templates and inlines defined in headers, replacing undefined atoms with actual definition atoms, and for merging copies of constants like c-strings and floating point constants.
The linker supports coalescing by-name and by-content. By-name is used for tentative definitions and weak symbols. By-content is used for constant data that can be merged.
The resolving process maintains some global linking “state”, including a “symbol table” which is a map from llvm::StringRef to lld::Atom*. With these data structures, the linker iterates all atoms in all input files. For each atom, it checks if the atom is named and has a global or hidden scope. If so, the atom is added to the symbol table map. If there already is a matching atom in that table, that means the current atom needs to be coalesced with the found atom, or it is a multiple definition error.
When all initial input file atoms have been processed by the resolver, a scan is made to see if there are any undefined atoms in the graph. If there are, the linker scans all libraries (both static and dynamic) looking for definitions to replace the undefined atoms. It is an error if any undefined atoms are left remaining.
Dead code stripping (if requested) is done at the end of resolving. The linker does a simple mark-and-sweep. It starts with “root” atoms (like “main” in a main executable), follows each reference, and marks each Atom that it visits as “live”. When done, all atoms not marked “live” are removed.
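A minimal sketch of that mark-and-sweep, with hypothetical Atom and Reference types standing in for lld's real classes:

#include <algorithm>
#include <unordered_set>
#include <vector>

struct Atom;
struct Reference { Atom *target = nullptr; };
struct Atom      { std::vector<Reference> references; };

void deadStrip(std::vector<Atom *> &atoms, const std::vector<Atom *> &roots) {
  std::vector<Atom *> worklist(roots.begin(), roots.end());
  std::unordered_set<Atom *> live;
  while (!worklist.empty()) {                  // mark: walk edges from roots
    Atom *a = worklist.back();
    worklist.pop_back();
    if (!a || !live.insert(a).second)
      continue;                                // null target or already live
    for (const Reference &r : a->references)
      worklist.push_back(r.target);
  }
  atoms.erase(std::remove_if(atoms.begin(), atoms.end(),   // sweep
                             [&](Atom *a) { return !live.count(a); }),
              atoms.end());
}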
The result of the Resolving phase is the creation of an lld::File object. The goal is that the lld::File model is the internal representation throughout the linker. The file readers parse (mach-o, ELF, COFF) into an lld::File. The file writers (mach-o, ELF, COFF) take an lld::File and produce their file kind, and every Pass operates only on an lld::File. This is not only a simpler, more consistent model, but it enables the state of the linker to be dumped at any point in the link for testing purposes.
The Passes step is an open-ended set of routines that each get a chance to modify or enhance the current lld::File object. Some example Passes are:
- stub (PLT) generation
- GOT instantiation
- order_file optimization
- branch island generation
- branch shim generation
- Objective-C optimizations (Darwin specific)
- TLV instantiation (Darwin specific)
- DTrace probe processing (Darwin specific)
- compact unwind encoding (Darwin specific)
Some of these passes are specific to Darwin’s runtime environments. But many of the passes are applicable to any OS (such as generating branch islands for out-of-range branch instructions).
The general structure of a pass is to iterate through the atoms in the current lld::File object, inspecting each atom and doing something. For instance, the stub pass looks for call sites to shared library atoms (e.g. a call to printf). It then instantiates a “stub” atom (PLT entry) and a “lazy pointer” atom for each proxy atom needed, and these new atoms are added to the current lld::File object. Next, all the noted call sites to shared library atoms have their References altered to point to the stub atom instead of the shared library atom.
Once the passes are done, the output file writer is given the current lld::File object. The writer’s job is to create the executable content file wrapper and place the content of the atoms into it.
lld uses a plug-in model for writing output files. All concrete writers (e.g. ELF, mach-o, etc) are subclasses of the lld::Writer class.
Unlike the Reader class which has just one method to instantiate an lld::File, the Writer class has multiple methods. The crucial method is to generate the output file, but there are also methods which allow the Writer to contribute Atoms to the resolver and specify passes to run.
An example of contributing atoms is that if the Writer knows a main executable is being linked and such an executable requires a specially named entry point (e.g. “_main”), the Writer can add an UndefinedAtom with that special name to the resolver. This will cause the resolver to issue an error if that symbol is not defined.
Sometimes a Writer supports lazily created symbols, such as names for the start of sections. To support this, the Writer can create a File object which vends no initial atoms, but does lazily supply atoms by name as needed.
Every Writer subclass defines its own “options” class (for instance the mach-o Writer defines the class WriterOptionsMachO). This options class is the one-and-only way to control how the Writer operates when producing an output file from an Atom graph. For instance, you may want the Writer to optimize the output for certain OS versions, or strip local symbols, etc. The options class can be instantiated from command line options, or it can be subclassed and the ivars programmatically set.
lld::File representations¶
Just as LLVM has three representations of its IR model, lld has two representations of its File/Atom/Reference model:

- in-memory, abstract C++ classes (lld::Atom, lld::Reference, and lld::File), and
- a textual format (in YAML).
In designing a textual format we want something easy for humans to read and easy for the linker to parse. Since an atom has lots of attributes, most of which are usually just the default, we define default values for every attribute so that those can be omitted from the text representation. Here are the atoms for a simple hello world program expressed in YAML:
target-triple: x86_64-apple-darwin11
atoms:
- name: _main
scope: global
type: code
content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00,
00, 00, 31, c0, 5d, c3 ]
fixups:
- offset: 07
kind: pcrel32
target: 2
- offset: 0E
kind: call32
target: _fprintf
- type: c-string
content: [ 73, 5A, 00 ]
...
The biggest use for the textual format will be writing test cases. Writing test cases in C is problematic because the compiler may vary its output over time for its own optimization reasons, which may inadvertently disable or break the linker feature being tested. By writing test cases in the linker’s own textual format, we can specify exactly every attribute of every atom and thus target specific linker logic.
The textual/YAML format follows the ReaderWriter patterns used in lld. The lld library comes with the classes: ReaderYAML and WriterYAML.
Testing¶
The lld project contains a test suite which is being built up as new code is added to lld. All new lld functionality should have a test added to the test suite. The test suite is lit driven. Each test is a text file with comments telling lit how to run the test and check the result. To facilitate testing, the lld project builds a tool called lld-core. This tool reads a YAML file (by default from stdin), parses it into one or more lld::File objects in memory, and then feeds those lld::File objects to the resolver phase.
Basic testing covers the “core linking” or resolving phase. That is where the linker merges object files. All test cases are written in YAML. One feature of YAML is that it allows multiple “documents” to be encoded in one YAML stream. That means one text file can appear to the linker as multiple .o files - the normal case for the linker.
Here is a simple example of a core linking test case. It checks that an undefined atom from one file will be replaced by a definition from another file:
# RUN: lld-core %s | FileCheck %s
#
# Test that undefined atoms are replaced with defined atoms.
#
---
atoms:
- name: foo
definition: undefined
---
atoms:
- name: foo
scope: global
type: code
...
# CHECK: name: foo
# CHECK: scope: global
# CHECK: type: code
# CHECK-NOT: name: foo
# CHECK: ...
Since Passes just operate on an lld::File object, the lld-core tool has the option to run a particular pass (after resolving). Thus, you can write a YAML test case with carefully crafted input to exercise areas of a Pass and then check the resulting lld::File object as represented in YAML.
Design Issues¶
There are a number of open issues in the design of lld. The plan is to wait and make these design decisions when we need to.
Currently, the lld model says nothing about debug info. But the most popular debug format is DWARF, and there is some impedance mismatch between the lld model and DWARF. In lld there are just Atoms, and only Atoms that need to be in a special section at runtime have an associated section. Also, Atoms do not have addresses. The way DWARF is specified, different parts of DWARF are supposed to go into specially named sections, and DWARF references function code by address.
Currently, lld has an abstract “Platform” that deals with any CPU or OS specific differences in linking. We just keep adding virtual methods to the base Platform class as we find linking areas that might need customization. At some point we’ll need to structure this better.
Currently, lld::File just has a path and a way to iterate its atoms. We will need to add more attributes to a File. For example, some equivalent to the target triple. There are also a number of cached or computed attributes that could make various Passes more efficient. For instance, on Darwin there are a number of Objective-C optimizations that can be done by a Pass. But it would improve the plain C case if the Objective-C optimization Pass did not have to scan all atoms looking for any Objective-C data structures. This could be done if the lld::File object had an attribute that says whether the file has any Objective-C data in it. The Resolving phase would then be required to “merge” that attribute as object files are added.
Getting Started: Building and Running lld¶
This page gives you the shortest path to checking out and building lld. If you run into problems, please file bugs in the LLVM Bugzilla.
Building lld¶
- Get the required tools.
- CMake 2.8+.
- make (or any build system CMake supports).
- Clang 3.1+ or GCC 4.7+ (C++11 support is required).
- If using Clang, you will also need libc++.
- Python 2.4+ (not 3.x) for running tests.
Check out LLVM:
$ cd path/to/llvm-project
$ svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
Check out lld:
$ cd llvm/tools
$ svn co http://llvm.org/svn/llvm-project/lld/trunk lld
- lld can also be checked out to path/to/llvm-project and built as an external project.
Build LLVM and lld:
$ cd path/to/llvm-build/llvm  (out of source build required)
$ cmake -G "Unix Makefiles" path/to/llvm-project/llvm
$ make
If you want to build with clang and it is not the default compiler or it is installed in an alternate location, you’ll need to tell the cmake tool the location of the C and C++ compiler via CMAKE_C_COMPILER and CMAKE_CXX_COMPILER. For example:
$ cmake -DCMAKE_CXX_COMPILER=/path/to/clang++ -DCMAKE_C_COMPILER=/path/to/clang ...
Test:
$ make check-lld
- Get the required tools.
- CMake 2.8+.
- Visual Studio 12 (2013) or later (required for C++11 support)
- Python 2.4+ (not 3.x) for running tests.
Check out LLVM:
$ cd path/to/llvm-project
$ svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
Check out lld:
$ cd llvm/tools
$ svn co http://llvm.org/svn/llvm-project/lld/trunk lld
- lld can also be checked out to path/to/llvm-project and built as an external project.
Generate Visual Studio project files:
$ cd path/to/llvm-build/llvm  (out of source build required)
$ cmake -G "Visual Studio 11" path/to/llvm-project/llvm
Build:
- Open LLVM.sln in Visual Studio.
- Build the ALL_BUILD target.

Test:
- Build the lld-test target.
For more information on using CMake see the LLVM CMake guide.
Development¶
Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.
lld is developed as part of the LLVM project.
Creating a Reader¶
See the Creating a Reader guide.
Debugging¶
You can run lld with the -mllvm -debug command line options to enable debugging printouts. If you want to enable debug information for some specific pass, you can run it with -mllvm '-debug-only=<pass>', where <pass> is the name used in the DEBUG_WITH_TYPE() macro.
Documentation¶
The project documentation is written in reStructuredText and generated using the Sphinx documentation generator. For more information on writing documentation for the project, see the Sphinx Introduction for LLVM Developers.
Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.
The purpose of a “Reader” is to take an object file in a particular format and create an lld::File (which is a graph of Atoms) representing the object file. A Reader inherits from lld::Reader, which lives in include/lld/Core/Reader.h and lib/Core/Reader.cpp.
The Reader infrastructure for an object format Foo requires the following pieces in order to fit into lld:

include/lld/ReaderWriter/ReaderFoo.h

- class ReaderOptionsFoo : public ReaderOptions
  This Options class is the only way to configure how the Reader will parse any file into an lld::File object. This class should be declared in the lld namespace.
- Reader *createReaderFoo(ReaderOptionsFoo &reader)
  This factory function configures and creates the Reader. This function should be declared in the lld namespace.

lib/ReaderWriter/Foo/ReaderFoo.cpp

- class ReaderFoo : public Reader
  This is the concrete Reader class which can be called to parse object files. It should be declared in an anonymous namespace or, if there is shared code with lld::WriterFoo, in a nested namespace (e.g. lld::foo).
You may have noticed that ReaderFoo
is not declared in the
.h
file. An important design aspect of lld is that all Readers are
created only through an object-format-specific
createReaderFoo()
factory function. The creation of the Reader is
parametrized through a ReaderOptionsFoo
class. This options
class is the one-and-only way to control how the Reader operates when
parsing an input file into an Atom graph. For instance, you may want the
Reader to only accept certain architectures. The options class can be
instantiated from command line options or be programmatically configured.
The lld project already has a skeleton of source code for Readers for ELF, PECOFF, MachO, and lld’s native YAML graph format.
If your file format is a variant of one of those, you should modify the existing Reader to support your variant. This is done by customizing the Options class for the Reader and making appropriate changes to the .cpp file to interpret those options and act accordingly.
If your object file format is not a variant of any existing Reader, you’ll need to create a new Reader subclass with the organization described above.
The linker will usually only instantiate your Reader once. That one Reader will have its loadFile() method called many times with different input files. To support multithreaded linking, the Reader may be parsing multiple input files in parallel. Therefore, there should be no parsing state in your Reader object. Any parsing state should be in ivars of your File subclass or in some temporary object.
The key method to implement in a reader is:
virtual error_code loadFile(LinkerInput &input,
std::vector<std::unique_ptr<File>> &result);
It takes a memory buffer (which contains the contents of the object file being read) and returns an instantiated lld::File object, which is a collection of Atoms. The result is a vector of File pointers (instead of simply a File pointer) because some file formats allow multiple object “files” to be encoded in one file-system file.
Atoms are always owned by their File object. During core linking when Atoms are coalesced or stripped away, core linking does not delete them. Core linking just removes those unused Atoms from its internal list. The destructor of a File object is responsible for deleting all Atoms it owns, and if ownership of the MemoryBuffer was passed to it, the File destructor needs to delete that too.
The internal model of lld is purely Atom based. But most object files do not have an explicit concept of Atoms, instead most have “sections”. The way to think of this is that a section is just a list of Atoms with common attributes.
The first step in parsing section-based object files is to cleave each section into a list of Atoms. The technique may vary by section type. For code sections (e.g. .text), there are usually symbols at the start of each function. Those symbol addresses are the points at which the section is cleaved into discrete Atoms. Some file formats (like ELF) also include the length of each symbol in the symbol table. Otherwise, the length of each Atom is calculated to run to the start of the next symbol or the end of the section.
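A sketch of that cleaving step for a code section, using hypothetical types (when the format records symbol sizes, as ELF does, those sizes would be used instead of the next-symbol heuristic):

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct Sym { uint64_t addr; std::string name; };

// Split a section into (symbol, atom length) pairs: each atom runs from its
// symbol to the next symbol, and the last atom runs to the end of the section.
std::vector<std::pair<Sym, uint64_t>> cleave(std::vector<Sym> syms,
                                             uint64_t sectionEnd) {
  std::sort(syms.begin(), syms.end(),
            [](const Sym &a, const Sym &b) { return a.addr < b.addr; });
  std::vector<std::pair<Sym, uint64_t>> atoms;
  for (size_t i = 0; i < syms.size(); ++i) {
    uint64_t end = (i + 1 < syms.size()) ? syms[i + 1].addr : sectionEnd;
    atoms.push_back({syms[i], end - syms[i].addr});
  }
  return atoms;
}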
Other section types can be implicitly cleaved. For instance, c-string literals or unwind info (e.g. .eh_frame) can be cleaved by having the Reader look at the content of the section. It is important to cleave sections into Atoms to remove false dependencies. For instance, the .eh_frame section often has no symbols but contains “pointers” to the functions for which it has unwind info. If the .eh_frame section were not cleaved (but left as one big Atom), there would always be a reference (from the eh_frame Atom) to each function, so the linker would be unable to coalesce or dead-strip the function atoms.
The lld Atom model also requires that a reference to an undefined symbol be modeled as a Reference to an UndefinedAtom. So the Reader also needs to create an UndefinedAtom for each undefined symbol in the object file.
Once all Atoms have been created, the second step is to create References (recall that Atoms are “nodes” and References are “edges”). Most References are created by looking at the “relocation records” in the object file. If a function contains a call to “malloc”, there is usually a relocation record specifying the address in the section and the symbol table index. Your Reader will need to convert the address to an Atom and offset and the symbol table index into a target Atom. If “malloc” is not defined in the object file, the target Atom of the Reference will be an UndefinedAtom.
Once you have the above working to parse an object file into Atoms and References, you’ll want to look at performance. Some techniques that can help performance are:
- Use llvm::BumpPtrAllocator or pre-allocate one big vector<Reference> and then just have each atom point to its subrange of References in that vector. This can be faster than allocating each Reference as a separate object. (See the sketch after this list.)
- Pre-scan the symbol table and determine how many atoms are in each section then allocate space for all the Atom objects at once.
- Don’t copy symbol names or section content to each Atom, instead use StringRef and ArrayRef in each Atom to point to its name and content in the MemoryBuffer.
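For instance, the first tip might look like the sketch below (Reference here is a hypothetical stand-in for lld's real class):

#include "llvm/Support/Allocator.h"
#include <cstddef>
#include <new>

struct Reference { int kind = 0; long offset = 0; void *target = nullptr; };

llvm::BumpPtrAllocator allocator;

// Carve all of an atom's References out of one slab rather than making a
// separate heap allocation per Reference.
Reference *allocateReferences(size_t count) {
  Reference *refs = allocator.Allocate<Reference>(count);
  for (size_t i = 0; i < count; ++i)
    new (&refs[i]) Reference(); // construct in the uninitialized slab storage
  return refs;
}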
We are still working on infrastructure to test Readers. The issue is that you don’t want to check binary files into the test suite, and the tools for creating your object file from assembly source may not be available on every OS.
We are investigating a way to use YAML to describe the section, symbols, and content of a file. Then have some code which will write out an object file from that YAML description.
Once that is in place, you can write test cases that contain section/symbol YAML and are run through the linker to produce Atom/Reference-based YAML, which is then run through FileCheck to verify the Atoms and References are as expected.
Note: this document discusses the Mach-O port of LLD. For ELF and COFF, see LLD - The LLVM Linker.
This document describes the lld driver. The purpose of this document is to describe both the motivation and design goals for the driver, as well as details of the internal implementation.
The lld driver is designed to support a number of different command line interfaces. The main interfaces we plan to support are binutils’ ld, Apple’s ld, and Microsoft’s link.exe.
Each of these different interfaces is referred to as a flavor. There is also an extra flavor, “core”, which is used to exercise the core functionality of the linker in the test suite.
- gnu
- darwin
- link
- core
There are two different ways to tell lld which flavor to be. They are checked in order, so the second overrides the first. The first is to symlink lld as lld-{flavor} or just {flavor}. You can also specify the flavor as the first command line argument using -flavor:

$ lld -flavor gnu

There is a shortcut for -flavor core: -core.
To add an option to an existing flavor:

- Add the option to the desired lib/Driver/{flavor}Options.td.
- Add to lld::{Flavor}LinkingContext a getter and setter method for the option.
- Modify lld::{Flavor}Driver::parse() in lib/Driver/{Flavor}Driver.cpp to call the targetInfo setter corresponding to the option.
- Modify {Flavor}Reader and {Flavor}Writer to use the new targetInfo option.
To add a new flavor:

- Add an entry for the flavor in include/lld/Common/Driver.h to lld::UniversalDriver::Flavor.
- Add an entry in lib/Driver/UniversalDriver.cpp to lld::Driver::strToFlavor() and lld::UniversalDriver::link(). This allows the flavor to be selected via symlink and -flavor.
- Add a tablegen file called lib/Driver/{flavor}Options.td that describes the options. If the options are a superset of another driver, that driver’s td file can simply be included. The {flavor}Options.td file must also be added to lib/Driver/CMakeLists.txt.
- Add a {Flavor}Driver as a subclass of lld::Driver in lib/Driver/{Flavor}Driver.cpp.
Open Projects¶
include/lld/Core¶
- The YAML reader/writer interfaces should be changed to return an explanatory string if there is an error. The existing error_code abstraction only works for returning low-level OS errors. It does not work for describing formatting issues.
- We need to add more attributes to File. In particular, we need cpu and OS information (like target triples). We should also provide explicit support for LLVM IR module flags metadata.
Documentation TODOs¶
Sphinx Introduction for LLVM Developers¶
This document is intended as a short and simple introduction to the Sphinx documentation generation system for LLVM developers.
Quickstart¶
To get started writing documentation, you will need to:
- Have the Sphinx tools installed.
- Understand how to build the documentation.
- Start writing documentation!
You should be able to install Sphinx using the standard Python package installation tool easy_install, as follows:
$ sudo easy_install sphinx
Searching for sphinx
Reading http://pypi.python.org/simple/sphinx/
Reading http://sphinx.pocoo.org/
Best match: Sphinx 1.1.3
... more lines here ..
If you do not have root access (or otherwise want to avoid installing Sphinx in system directories), see the section on Installing Sphinx in a Virtual Environment below.
If you do not have the easy_install tool on your system, you should be able to install it using:
- Linux
- Use your distribution’s standard package management tool to install it, i.e.,
apt-get install easy_install
oryum install easy_install
.- Mac OS X
- All modern Mac OS X systems come with
easy_install
as part of the base system.- Windows
- See the setuptools package web page for instructions.
In order to build the documentation you need to add -DLLVM_ENABLE_SPHINX=ON to your cmake command. Once you do this you can build the docs using the docs-lld-html build target (with ninja or make).

That build target will invoke sphinx-build with the appropriate options for the project, and generate the HTML documentation in a tools/lld/docs/html subdirectory.
The documentation itself is written in the reStructuredText (ReST) format, and Sphinx defines additional tags to support features like cross-referencing.
The ReST format itself is organized around documents mostly being readable plaintext documents. You should generally be able to write new documentation easily just by following the style of the existing documentation.
If you want to understand the formatting of the documents more, the best place to start is Sphinx’s own ReST Primer.
Learning More¶
If you want to learn more about the Sphinx system, the best place to start is the Sphinx documentation itself, available here.
Installing Sphinx in a Virtual Environment¶
Most Python developers prefer to work with tools inside a virtualenv (virtual environment) instance, which functions as an application sandbox. This avoids polluting your system installation with different packages used by various projects (and ensures that dependencies for different packages don’t conflict with one another). Of course, you need to first have the virtualenv software itself which generally would be installed at the system level:
$ sudo easy_install virtualenv
but after that you no longer need to install additional packages in the system directories.
Once you have the virtualenv tool itself installed, you can create a virtualenv for Sphinx using:
$ virtualenv ~/my-sphinx-install
New python executable in /Users/dummy/my-sphinx-install/bin/python
Installing setuptools............done.
Installing pip...............done.
$ ~/my-sphinx-install/bin/easy_install sphinx
... install messages here ...
and from now on you can “activate” the virtualenv using:
$ source ~/my-sphinx-install/bin/activate
which will change your PATH to ensure the sphinx-build tool from inside the virtual environment will be used. See the virtualenv website for more information on using virtual environments.
WebAssembly lld port¶
Note: The WebAssembly port is still a work in progress and is lacking certain features.
The WebAssembly version of lld takes WebAssembly binaries as inputs and produces a WebAssembly binary as its output. For the most part this port tries to mimic the behaviour of traditional ELF linkers, and specifically the ELF lld port. Where possible, the command line flags and semantics are the same.
Object file format¶
The format of the input object files that lld expects is specified as part of the WebAssembly tool conventions: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md.

This is the object format that llvm produces when run with the wasm32-unknown-unknown target. Building llvm with WebAssembly support currently requires enabling the experimental backend using -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=WebAssembly.
Missing features¶
There are several key features that are not yet implemented in the WebAssembly port:
- COMDAT support. This means that support for C++ is still very limited.
- Function stripping. Currently there is no support for --gc-sections, so functions and data from a given object will be linked as a unit.
- Section start/end symbols. The synthetic symbols that mark the start and end of data regions are not yet created in the output file.
Windows support¶
LLD supports the Windows operating system. When invoked as lld-link.exe or with -flavor link, the driver for Windows is used to parse command line options, and it drives further linking processes. LLD accepts almost all command line options that the linker shipped with Microsoft Visual C++ (link.exe) supports.
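For example, a minimal invocation (file names illustrative) looks like:

$ lld-link hello.obj /out:hello.exe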
The current status is that LLD can link itself on Windows x86/x64 using Visual C++ 2013 as the compiler.
Development status¶
- Driver
- Mostly done. Some exotic command line options that are not usually used for application development, such as /DRIVER, are not supported.
- Linking against DLL
- Done. LLD can read import libraries needed to link against DLL. Both export-by-name and export-by-ordinal are supported.
- Linking against static library
- Done. The format of static library (.lib) on Windows is actually the same as on Unix (.a). LLD can read it.
- Creating DLL
- Done. LLD creates a DLL if the /DLL option is given. Exported functions can be specified either via the command line (/EXPORT) or via a module-definition file (.def). Both export-by-name and export-by-ordinal are supported.
- Windows resource files support
- Done. If a .res file is given, LLD converts the file to a COFF file using LLVM’s Object library.
- Done for both x86 and x64.
- Module-definition file
- Partially done. LLD currently recognizes these directives: EXPORTS, HEAPSIZE, STACKSIZE, NAME, and VERSION.
- Debug info
- Done. LLD can emit PDBs that are at parity with those generated by link.exe. However, LLD does not support /DEBUG:FASTLINK.
Downloading LLD¶
The Windows version of LLD is included in the pre-built binaries of LLVM’s releases and in the LLVM Snapshot Builds.
Building LLD¶
Using Visual Studio IDE/MSBuild¶
- Check out LLVM and LLD from the LLVM SVN repository (or Git mirror),
- run cmake -G "Visual Studio 12" <llvm-source-dir> from a VS command prompt,
- open LLVM.sln with Visual Studio, and
- build the lld target in the lld executables folder.
Alternatively, you can use msbuild if you don’t like to work in an IDE:
msbuild LLVM.sln /m /target:"lld executables\lld"
MSBuild.exe used to ship as a component of the .NET framework, but since 2013 it is part of Visual Studio. You can find it at “C:\Program Files (x86)\msbuild”.
You can build LLD as a 64-bit application. To do that, open a VS2013 x64 command prompt and run cmake for the “Visual Studio 12 Win64” target.
Using Ninja¶
- Check out LLVM and LLD from the LLVM SVN repository (or Git mirror),
- run cmake -G Ninja <llvm-source-dir> from a VS command prompt,
- run ninja lld
LLD 8.0.0 Release Notes¶
Warning
These are in-progress notes for the upcoming LLVM 8.0.0 release. Release notes for previous releases can be found on the Download Page.
Introduction¶
This document contains the release notes for the lld linker, release 8.0.0. Here we describe the status of lld, including major improvements from the previous release. All lld releases may be downloaded from the LLVM releases web site.
Non-comprehensive list of changes in this release¶
ELF Improvements¶
- Item 1.
COFF Improvements¶
- Item 1.
MachO Improvements¶
- Item 1.