This document collects my findings from experimenting with C++20 modules and the possibility of a modular Boost. It examines the costs and benefits of building and distributing existing Boost libraries as C++20 modules, so that users can write `import boost` in their code. The analysis focuses on header-only libraries.
The experiment is available in this repository.
Update: a follow-up post is now available.
A mental model for modules
Quick recap: modules are a C++20 language-level construct that aims to provide better encapsulation and reduce build times. Modules add a new translation unit type: module units. Each module is built into an artifact called a BMI (built module interface).
In our context, as an oversimplification, we can think of them as "well-behaved precompiled headers", with the following differences:
- You can import as many modules as you want, while you can only include one precompiled header.
- Declarations are not exported from modules by default: `export` needs to be used on the declarations to be exported.
- Macros can’t be exported from modules.
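To make the last two points concrete, here is a minimal sketch of a module interface unit. The file name `greeter.cxx`, the module name `greeter` and every identifier in it are hypothetical, chosen only for illustration, and building it needs a module-aware compiler and build system.

```cpp
// greeter.cxx: module interface unit (file and module names are hypothetical)
module;               // global module fragment: classic #includes live here
#include <string>

export module greeter;

// No `export`: usable inside this module only, invisible to importers.
std::string decorate(const std::string& name) {
    return "Hello, " + name + "!";
}

// `export` makes this declaration visible to any TU that writes `import greeter;`.
export std::string greet(const std::string& name) {
    return decorate(name);
}

// A macro defined here never reaches importers.
#define GREETER_VERSION 1
```

A consumer that writes `import greeter;` should be able to call `greet`, while any use of `decorate` or `GREETER_VERSION` in that consumer would fail to compile.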
Like PCHs, BMIs are highly non-portable and must be built by the library user (and not by us), and need some cooperation from the build system. For header-only libraries, this is the user’s build system (e.g. CMake), not the one we use (b2).
For each `import` the user places in their TU, the compiler needs a corresponding BMI. This introduces a build-time dependency (like precompiled headers do). The build system needs to scan TUs for imports to build the dependency graph.
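As a rough illustration of what "scanning for imports" means, the toy function below extracts the imported module names from the text of a TU. It is a sketch under strong assumptions (one declaration per line, no comments, no preprocessor conditionals); the function name `scan_imports` is made up, and real scanners have to run the preprocessor before they can trust what they see.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Toy import scanner: returns the names imported by a translation unit.
// Assumes one declaration per line and no comments or preprocessor tricks.
std::vector<std::string> scan_imports(const std::string& source) {
    std::vector<std::string> imports;
    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream words(line);
        std::string word;
        words >> word;
        if (word == "export") words >> word;  // handles "export import foo;"
        if (word != "import") continue;
        std::string name;
        words >> name;
        // Drop the trailing semicolon, if it is attached to the name.
        if (!name.empty() && name.back() == ';') name.pop_back();
        if (!name.empty()) imports.push_back(name);
    }
    return imports;
}
```

Calling `scan_imports` on a source containing `import std;` and `import asio;` yields the two names `std` and `asio`. A build system would turn each name into an edge from the TU to the target that produces the matching BMI.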
Header units (`import <boost/asio.hpp>;`) are supposed to be an intermediate step between headers and modules. I haven’t explored them because they don’t have CMake support, and are supposedly slower.
Compiler and tooling support
At the time of writing this article, support for modules is in its early stages. Concretely, when using CMake, the following requirements apply:
- CMake 3.28 or later (stable module support has been out for two releases).
- MSVC 14.34, clang-16, or gcc-14 (gcc-14 hasn’t been released yet, but module support has already been merged).
- When building for UNIX, Ninja 1.11.
Note that build2 has support for modules and even header units (where CMake hasn’t). As we only ship CMake files, I haven’t investigated build2 further.
C++23 enables `import std`, which is supposed to improve compile times. This is currently less developed:
- The MSVC standard library and libc++ ship with standard modules, while libstdc++ doesn’t support them yet.
- In both MSVC and libc++, the library ships with the module source code, and users need to build the modules themselves.
- CMake intends to add an easy way to build standard modules, but there is nothing yet. Manual approaches are possible for testing but not adequate for production.
No IDE supports modules yet (`clangd-19` enters an infinite loop when it sees an `import`), which disables autocompletion and highlighting.
This means that this feature is very unlikely to be used in production right now. We may encounter early adopters and pet projects, but not many industry users yet.
How to modularize a library
Before measuring compile times, we need a familiar library to be consumed as a module. While Matt Borland and John Maddock have done a great job writing a modular version of Boost.Math, I don’t have realistic, slow-to-compile code that uses it to perform adequate benchmarks. So I’ve gone ahead and modularized standalone Asio.
While Math’s approach works, it’s intrusive and requires a lot of work. An easier approach (employed by libc++) is exporting the required names with `using` declarations:
// File asio.cxx. Defines how to build Asio as a module so it can be imported
module;
#include <asio.hpp>
export module asio;
namespace asio {
    export using asio::io_context;
    export using asio::post;
    export using asio::any_io_executor;
    // ...
}
There are a number of caveats with this approach though:
- `constexpr` variables need to be `inline` to be exported. This is not the case in most libraries, although it should be. This requires submitting PRs to libraries. As mentioned by @opensdh, there is a proposal to make this easier.
- Some libraries define template specializations in other namespaces, like `std`. With the current language rules, these cause trouble when employing `export using`. See this section on the follow-up article for more info about this topic.
- Making a definition available using `import` in some TUs and `#include` in others seems to work when employing the `export using` technique, but may cause ODR violations otherwise, as pointed out by Peter Dimov. This is troublesome considering the above point.
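The first caveat comes down to linkage. As I understand the C++20 rules, an exported using-declaration may only name entities with external linkage, and a plain namespace-scope `constexpr` variable has internal linkage. The sketch below shows the two forms side by side; the namespace `lib` and both variable names are hypothetical.

```cpp
namespace lib {

// Non-inline constexpr at namespace scope: implicitly const, so it has
// internal linkage. Every TU that includes this gets its own copy, and
// `export using lib::default_port;` is expected to be rejected.
constexpr int default_port = 8080;

// inline constexpr: external linkage, a single entity shared by all TUs.
// This is the form that `export using lib::max_connections;` can name.
inline constexpr int max_connections = 128;

}  // namespace lib
```

Both variables behave identically for header consumers, which is why the change is usually safe to submit upstream.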
Measuring build-time benefits
The experiment involves building a simple, async, coroutine-based server that listens for connections, reads and writes using SSL.
As with PCHs, only executables with several translation units including similar headers may see benefits. The benchmark involves adding translation units, all of them building the same server, with and without modules.
Benchmark conditions:
- clang-19 (Linux)
- Release CMake build
- Building with 3 cores to measure the effects of parallelism.
- Modular builds use Asio and the standard library as modules. The time to build such modules is included in the benchmark.
Number of translation units | Build time (modules, s) | Build time (headers, s) |
---|---|---|
1 TU | 09.124 | 06.909 |
2 TU | 10.708 | 07.446 |
3 TU | 12.280 | 09.773 |
4 TU | 14.786 | 16.057 |
5 TU | 16.065 | 16.631 |
6 TU | 16.374 | 17.972 |
7 TU | 20.966 | 24.695 |
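To read the table more easily, the sketch below computes the modules-to-headers ratio for each row and finds the point where the modular build starts to win. The numbers are copied from the table and assumed to be seconds; the type and function names are mine, and the checks are evaluated at compile time (C++17).

```cpp
#include <array>

// One row of the benchmark table (times assumed to be in seconds).
struct Row {
    int tus;
    double modules;
    double headers;
};

constexpr std::array<Row, 7> rows{{
    {1, 9.124, 6.909},
    {2, 10.708, 7.446},
    {3, 12.280, 9.773},
    {4, 14.786, 16.057},
    {5, 16.065, 16.631},
    {6, 16.374, 17.972},
    {7, 20.966, 24.695},
}};

// A ratio below 1.0 means the modular build was faster.
constexpr double ratio(const Row& r) { return r.modules / r.headers; }

// Smallest TU count at which the modular build wins, or 0 if it never does.
constexpr int break_even_tus() {
    for (const Row& r : rows)
        if (ratio(r) < 1.0) return r.tus;
    return 0;
}

static_assert(break_even_tus() == 4);
static_assert(ratio(rows[0]) > 1.3);   // 1 TU: modules roughly 32% slower
static_assert(ratio(rows[6]) < 0.85);  // 7 TUs: modules roughly 15% faster
```

In other words, the fixed cost of building the two modules is only amortized from 4 TUs onwards, and even at 7 TUs the saving is about 15%.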
Benefits are not as big as expected. Compiling the modular build with `-ftime-trace` shows the following:
- The slowest artifacts to build are the `std` module, the Asio module and the server TUs.
- The `std` and `asio` modules build in parallel (Asio uses includes for `std`). The server TUs require the module objects and won’t start building until the former are ready.
- Each of the two modules takes around 4s to build. This is spent including headers and parsing declarations.
- Building the server TUs takes 6s in total: 2s in the compiler’s frontend (performing instantiations) and 4s in the backend (performing optimizations).
- The header version takes 9s, of which 3s are spent parsing headers, a cost that is absent from the module version.
- Rebuilds (as happens during local development) are significantly faster in the module version - see my follow-up post for details.
Although the gains are non-zero, I find them slightly disappointing. They may be bigger for bigger projects, debug builds or different libraries. The benefits on rebuilds may be enough for some users to consider modules, though.
Consuming Boost using modules
If we write module code for some Boost libraries, we need to ship the code and provide users with a way to build and consume it. As we ship CMake bindings with our libraries, the obvious path is to enhance this to include building Boost modules.
This is what the end user’s CMake could look like:
# Same as today
find_package(Boost REQUIRED)
# A function defined by find_package(Boost). Builds the Boost.Asio module into a target named asio_module
add_boost_asio_module(asio_module)
# Possibly set compile flags required by dependent targets
# Use the module
add_executable(server main.cpp)
target_link_libraries(server PRIVATE asio_module)
This resembles the `pch` rule in B2. Under the hood, the function creates a library target that builds the corresponding Boost module. For instance:
function (add_boost_asio_module NAME)
    set(ROOT @CMAKE_INSTALL_PREFIX@)
    add_library(${NAME})
    target_include_directories(${NAME} PRIVATE ${ROOT}/include)
    target_compile_features(${NAME} PUBLIC cxx_std_23)
    target_sources(${NAME} PUBLIC
        FILE_SET modules_public TYPE CXX_MODULES FILES
        ${ROOT}/module/asio.cxx
    )
endfunction()
A function may be more appropriate than an actual target because the module may need to be built several times, with different flags and definitions.
Such an approach requires non-trivial changes in either Boost.CMake or `boost_install`. Note that `vcpkg` users would not be able to access this, since `vcpkg` does not use the official Boost CMake modules. `conan` and system package managers would benefit.
Conclusion
- Modules are still at a very early stage. We won’t get many production users with this.
- A "module-only" Boost2 is probably not a good idea at this point.
- Modules may provide some compilation speed-up, but they’re not a panacea. Instantiation time isn’t affected by modules. You’re not wasting your time making your libraries less header-only.
- Providing modular "bindings" for some Boost libraries may be interesting to gain some real-world experience from early adopters.