But this won't work when we get to real cross-compiling like x86->arm. For example suppose our local host is x86 and we want to target ARM. Then we will need an x86->arm cross-toolchain to serve as our primary toolchain running on our local build host. But when it comes time to build PPX executables and run them (on our build host) to transform source code, we will need a secondary x86->x86 toolchain. And we will have to ensure that PPX co-dependencies are built with the primary x86->arm toolchain.
ppx Last updated June 30, 2022
With proper toolchain support, PPX management becomes a little more complicated.
When a build target depends on a tool (an executable) that also gets built, Bazel makes a configuration transition to ensure the tool is built with the proper toolchain. Since such tools are presumably intended to run on the build host, the transition is intended to ensure the toolchain used to build the tool targets the build host.
The configuration transition selects a toolchain for building the tool, and by transitivity that toolchain will be used to build any dependencies of the tool. This works, but ppx codeps throw a spanner into the works. PPX codependencies are injected into code transformed by a PPX executable. That means they must be built using the same toolchain as the one used for the target whose dependency on the PPX caused the configuration transition.
For example, suppose we are building module A
with a native→vm
toolchain. That means we build A with ocamlc.opt
, which is native
compiled (it runs on the build host, which is the local "native" machine)
but emits bytecode (its target host is the OCaml VM).
Suppose also that target A
has attribute ppx = ":ppx.exe"
. This
attribute is marked executable = True
, which means it must also be
defined with a cfg
property set to either exec
or target
. The
former tells Bazel to transition to the platform configuration of the
build host; the latter, to that of the target host. For
cross-compilation, setting it to exec
prevents the tool from being
built to the target platform. Note that cfg
may also be set to a
user-defined transition function.
Now ppx.exe
will have its own dependencies; they will be built using
the same toolchain selected for building ppx.exe
, since the
configuration used to select the toolchain for each target will be the
one set for ppx.exe
- which is controlled by the cfg
property of
the ppx
attribute of the A
target. Since we’re building A
with a
cross-compiler (from native to bytecode), the cfg="exec"
transition
will set the parameters that select the native→native toolchain for
building ppx.exe
and its dependencies. We’ll call this the "ppx
toolchain".
But now suppose one of those dependencies is a ppx_module
[1]: — let’s
call it module PpxB
--that injects a dependency into the files it
transforms. For example suppose it injects some code that depends on
module C
. Then when ppx.exe
(which includes PpxB
) transforms a
file, say foo.ml
, the result will be a source file whose compilation
depends on C
. Which means in turn that module C
must be compiled
with the original native→vm toolchain used to compile module A
,
not the native→native PPX toolchain.
With OBazl we express such dependencies using the ppx_codeps
attribute of rule ppx_module
. The problem then is that these become
(indirect) dependencies of ppx.exe
, which means they will be
compiled with the PPX toolchain, which is native→native. This won’t
cause a problem until we try to link A
into an executable. The
compiler will complain don’t know what to do with c.cmxa
, since
we’re using ocamlc.opt
which expects to see .cma
files on the
command line, not .cmxa
files.
This is of course a toy example, but this situation would arise in true cross-compilation scenarios, say where the build host is x86 and the target host is arm. In that case, the ppx_codeps would be built with the x86→x86 toolchain instead of the x86→arm toolchain, with the same results.
Just to complicate things a little more: ppx module sources themselves may
undergo PPX transformation. So we could have a scenario where we need
to ppx process a.ml
using ppxA.exe
, which depends on a ppx module
needing ppx transformation by ppxB.exe
, which depends on another ppx
module needing transformation by ppxC.exe
, and so on.
For now we just set cfg = "target"
, and that works where the
cross-compile is just between native and bytecode. We can get away
with this because we can always run the PPX executable on either
platform, native or bytecode. So for example if we’re using the
native→vm toolchain, we do not need to target the native platform for
our ppx executables, because we can always run the vm. In other words
the VM is always available to us even when we’re running our tools
natively.
Here is how OBazl plans to address the issue.
-
We attach a transition function to the
ppx
attribute (ofocaml_module
andocaml_signature
)-
the transition function writes the current toolchain config to a pair of build settings
-
For ppx codeps we have two scenarios:
-
Precompiled codeps that we import using
opam_import
(orocaml_import
). The import rules import both bytecode (.cma
) and native (cmxa
) archives. Rule logic reads the toolchain config info to decide which to integrate into the build. -
codeps that we need to build. In this case, we have to ensure that the correct toolchain is selected for ppx codependencies.
We attach another transition function of attr ppx_codeps
. By the
time it receives control, the toolchain build settings have been
written (by the ppx
transition function), so the ppx_codeps
transition function uses that information to reset the build settings
determining toolchain selection.
The build settings recording the toolchain config params are only used by opam_import? Everywhere else its a matter of toolchain selection, which is handled by Bazel.
ppx_module
rule is the same as a standard ocaml_module
rule, except that it has a ppx_codeps
attribute, which ocaml_module
does not support, and it delivers a PpxModuleMarker
provider. Ppx modules may depend on standard modules, but standard modules cannot depend on ppx modules.