ppx Last updated June 30, 2022

With proper toolchain support, PPX management becomes a little more complicated.

When a build target depends on a tool (an executable) that also gets built, Bazel makes a configuration transition to ensure the tool is built with the proper toolchain. Since such tools are presumably intended to run on the build host, the transition is intended to ensure the toolchain used to build the tool targets the build host.

The configuration transition selects a toolchain for building the tool, and by transitivity that toolchain will be used to build any dependencies of the tool. This works, but ppx codeps throw a spanner into the works. PPX codependencies are injected into code transformed by a PPX executable. That means they must be built using the same toolchain as the one used for the target whose dependency on the PPX caused the configuration transition.

For example, suppose we are building module A with a native→vm toolchain. That means we build A with ocamlc.opt, which is native compiled (it runs on the build host, which is the local "native" machine) but emits bytecode (its target host is the OCaml VM).

Suppose also that target A has attribute ppx = ":ppx.exe". This attribute is marked executable = True, which means it must also be defined with a cfg property set to either exec or target. The former tells Bazel to transition to the platform configuration of the build host; the latter, to that of the target host. For cross-compilation, setting it to exec prevents the tool from being built to the target platform. Note that cfg may also be set to a user-defined transition function.

Now ppx.exe will have its own dependencies; they will be built using the same toolchain selected for building ppx.exe, since the configuration used to select the toolchain for each target will be the one set for ppx.exe - which is controlled by the cfg property of the ppx attribute of the A target. Since we’re building A with a cross-compiler (from native to bytecode), the cfg="exec" transition will set the parameters that select the native→native toolchain for building ppx.exe and its dependencies. We’ll call this the "ppx toolchain".

But now suppose one of those dependencies is a ppx_module[1]: — let’s call it module PpxB --that injects a dependency into the files it transforms. For example suppose it injects some code that depends on module C. Then when ppx.exe (which includes PpxB) transforms a file, say foo.ml, the result will be a source file whose compilation depends on C. Which means in turn that module C must be compiled with the original native→vm toolchain used to compile module A, not the native→native PPX toolchain.

With OBazl we express such dependencies using the ppx_codeps attribute of rule ppx_module. The problem then is that these become (indirect) dependencies of ppx.exe, which means they will be compiled with the PPX toolchain, which is native→native. This won’t cause a problem until we try to link A into an executable. The compiler will complain don’t know what to do with c.cmxa, since we’re using ocamlc.opt which expects to see .cma files on the command line, not .cmxa files.

This is of course a toy example, but this situation would arise in true cross-compilation scenarios, say where the build host is x86 and the target host is arm. In that case, the ppx_codeps would be built with the x86→x86 toolchain instead of the x86→arm toolchain, with the same results.

Just to complicate things a little more: ppx module sources themselves may undergo PPX transformation. So we could have a scenario where we need to ppx process a.ml using ppxA.exe, which depends on a ppx module needing ppx transformation by ppxB.exe, which depends on another ppx module needing transformation by ppxC.exe, and so on.

For now we just set cfg = "target", and that works where the cross-compile is just between native and bytecode. We can get away with this because we can always run the PPX executable on either platform, native or bytecode. So for example if we’re using the native→vm toolchain, we do not need to target the native platform for our ppx executables, because we can always run the vm. In other words the VM is always available to us even when we’re running our tools natively.

 But this won't work when we get to real cross-compiling like
x86->arm. For example suppose our local host is x86 and we want to
target ARM. Then we will need an x86->arm cross-toolchain to serve as
our primary toolchain running on our local build host. But when it
comes time to build PPX executables and run them (on our build host)
to transform source code, we will need a secondary x86->x86 toolchain.
And we will have to ensure that PPX co-dependencies are built with the
primary x86->arm toolchain.

Here is how OBazl plans to address the issue.

  • We attach a transition function to the ppx attribute (of ocaml_module and ocaml_signature)

    • the transition function writes the current toolchain config to a pair of build settings

For ppx codeps we have two scenarios:

  • Precompiled codeps that we import using opam_import (or ocaml_import). The import rules import both bytecode (.cma) and native (cmxa) archives. Rule logic reads the toolchain config info to decide which to integrate into the build.

  • codeps that we need to build. In this case, we have to ensure that the correct toolchain is selected for ppx codependencies.

We attach another transition function of attr ppx_codeps. By the time it receives control, the toolchain build settings have been written (by the ppx transition function), so the ppx_codeps transition function uses that information to reset the build settings determining toolchain selection.

The build settings recording the toolchain config params are only used by opam_import? Everywhere else its a matter of toolchain selection, which is handled by Bazel.


1. The ppx_module rule is the same as a standard ocaml_module rule, except that it has a ppx_codeps attribute, which ocaml_module does not support, and it delivers a PpxModuleMarker provider. Ppx modules may depend on standard modules, but standard modules cannot depend on ppx modules.