26 Aug 2013

emscripten and PNaCl: Build Systems

I recently ported Nebula3 to Google's PNaCl. The main motivation was to see how it compares to asm.js, both for performance and "ease of use". This was basically a drive-by port; I didn't want to put too much effort into it. Thankfully I had old NaCl code lying around which I could reuse, and after 2 or 3 afternoons (and some WTF-moments) I had a pretty clean port running, which I'm planning to keep updated for the foreseeable future.

The big news about PNaCl is that deployment no longer has to go through the Chrome Web Store. It is now finally possible to host PNaCl applications from any URL.

You can check out the Nebula3 PNaCl demos here: http://www.flohofwoe.net/demos.html. Just make sure you're running the latest Google Chrome Canary, and if an error pops up saying that PNaCl isn't enabled, just restart Chrome and wait a little bit. The first start can take up to one minute, since PNaCl support is installed on demand, which is a multi-MByte download.

Over the next few weeks I'm intending to write up a little series of blog posts comparing the PNaCl and emscripten Nebula3 ports. From a coder's perspective, the two systems are actually fairly close when seen from high above.

As a "pragmatic programmer", I don't really care about the political side. Both asm.js and PNaCl had to take a lot of flak from web purists. The only thing that counts to me is that both technologies provide a seamless software distribution channel directly from the coder to the user. No app shops, gate-keepers, code-signing certificates or approval processes in between.

The Build System

First step is of course to get the SDKs. Both emscripten and PNaCl offer a GCC-style cross-compiling toolchain based on Clang-LLVM. Quick disclaimer: I'm running on OSX, haven't looked at the Windows side of things yet.

The emscripten SDK is simply installed and updated through a github repository. There's a stable master branch, and a bleeding-edge incoming branch. emscripten requires a couple of external tools, most notably Clang-LLVM, python and node.js. Even though clang is the standard compiler on OSX, I installed a separate version because emscripten required a newer version than was installed on OSX 10.7. Paths to external tools must be provided through a .emscripten config file in your home dir.

The NaCl SDK is a normal download-archive which should be unzipped to a nacl_sdk directory in your home directory. This download only contains a script file called "naclsdk" which takes care of downloading and updating the actual SDK files in the future. The NaCl SDK contains versioned bundles, each of which is actually a complete SDK in itself, with tools, headers, libraries and examples. This is the same philosophy as the DirectX SDKs. You pick a version to work with and decide yourself when to switch to a newer version. This guarantees you a stable API, and gives the dev team the freedom to change APIs in new versions without breaking code compiled against older versions.

One challenge with the NaCl SDK is finding the right compiler tools and runtime libs, since there are so many choices. The "classic" CPU-specific NaCl had different toolchains for ARM and Intel CPU architectures, and two different C runtime libs to choose from: newlib or glibc.

PNaCl is much simpler though: there are no longer different target CPU architectures since PNaCl executables are essentially LLVM bitcode, and the only available C runtime lib is newlib (which is the better choice anyway, since it is much slimmer than glibc).

In Nebula3 I'm using cmake to generate build files for different target platforms and build systems / IDEs. For each platform, you write a so-called toolchain file which contains paths to the cross-compiling tools, search paths to headers and libraries, and compiler/linker settings.

Writing such a toolchain file can be a bit of guesswork, but there are examples floating around the net. emscripten also comes with sample cmake toolchain files which might be helpful as a starting point.

Here are a couple of tips which might save you some trouble:

  • don't set "ld" as the linker tool: in both toolchains the normal compiler tool also serves as the linker (in emscripten this is emcc, in PNaCl use pnacl-clang++)
  • PNaCl requires an additional post-build step after linking, called pnacl-finalize; cmake's add_custom_command can be used for this

To properly separate the different build files I have a directory structure like this:

nebula3/
    code/
    cmake/
        emscripten_asmjs/
        emscripten_debug/
        pnacl_release/
        pnacl_debug/

All the source code lives under /code, and all the build files are generated under cmake/ with one directory per target platform and build configuration.

To actually generate the build files, I have a couple of shell scripts under /code which invoke cmake like this:

cd ../cmake/emscripten_asmjs
cmake -G "Eclipse CDT4 - Ninja" -DCMAKE_BUILD_TYPE="AsmJS" -DNEBULA_PLATFORM=EMSCRIPTEN -DCMAKE_TOOLCHAIN_FILE="../../bin/emscripten.toolchain.cmake" ../../code

The -G option selects the cmake "generator": here we're telling cmake that we want Eclipse project files which use the ninja build tool (ninja is a more modern make alternative). -DCMAKE_BUILD_TYPE sets the AsmJS build config (cmake lets us define any number of custom configs; commonly there's just Release and Debug, but for emscripten I have defined an extra AsmJS config). -DNEBULA_PLATFORM=EMSCRIPTEN is one of our own custom symbol definitions, which simply tells our cmake files that we're building for the emscripten target platform (this is actually redundant, a better place for this definition would be the toolchain file). Next we tell cmake which toolchain file to use, and finally where the source code is located (or more specifically: where to find the root CMakeLists.txt file; CMakeLists.txt files tell cmake what targets to build, and from what sources).

When cmake has run, we could import the generated project into Eclipse, or we can just run ninja from the command line:

ninja invocation

Writing a proper cmake-based build environment can be a lot of work, but it is definitely worth it. Managing a multi-platform build environment across Linux, OSX and Windows and probably several game consoles, spanning different IDEs like Visual Studio, Xcode and Eclipse, would be a nightmare without a meta-build-tool like cmake.

Deployment

Big jump here, but no worries, I'll deal with all the in-between stuff in the following blog posts.

The common thing between emscripten and PNaCl when deploying is that the generated files are embedded into a web page, and thus can be easily integrated into existing web site build and deployment processes.

The details are a little bit different between the two though:

An emscripten "executable" is either a .js file or a complete HTML page (the so-called shell page) which embeds the generated Javascript code. The emscripten linker looks at the output file extension to decide whether it should generate a .js or .html file. emscripten comes with a default html shell file which should be used as a starting point for a customised web page.

Integrating emscripten generated code into a web page is just the same as integrating any piece of complex Javascript code. Since emscripten-generated code is just Javascript, it is also very easy to interact with the rest of the page through direct JS function calls.
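To illustrate such a direct call: a C function that was exported at link time can be invoked from page script through emscripten's ccall helper. This is only a minimal sketch. The C function n3_set_volume and the wrapper setVolume are made-up names for illustration (Nebula3 doesn't have such a function), and depending on the emscripten version ccall may have to be exported explicitly.

```javascript
// Hypothetical wrapper: 'n3_set_volume' is a made-up exported C function
// with the signature: void n3_set_volume(float volume).
// 'module' is the emscripten Module object of the running application.
function setVolume(module, volume) {
    // ccall(function name, return type, argument types, argument values)
    // a return type of null means the C function returns void
    return module.ccall('n3_set_volume', null, ['number'], [volume]);
}
```

On the actual page this would be called as setVolume(Module, 0.5) once the emscripten runtime has finished loading.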

PNaCl on the other hand integrates like a plugin into the HTML page using the embed element:

<embed src="dragons.nmf" class="pnacl" id="pnacl_module" name="pnacl_module" width="800" height="452" type="application/x-pnacl"/>

The embed element is given a .nmf manifest file instead of the .pexe file itself, and the manifest contains the name of the .pexe file. This manifest used to look more interesting in classic NaCl, since it contained one entry for each target CPU architecture, but for PNaCl there's only one useful piece of information:

{
    "program": {
        "portable": {
            "pnacl-translate": {
                "url": "dragons.pexe"
            }
        }
    }
}

Finally, the type="application/x-pnacl" attribute is important for Chrome to recognise the embed element as a PNaCl application.
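Since the manifest only carries a single URL, it could just as well be generated as part of the build instead of being written by hand. A minimal sketch of this idea (the function name makePnaclManifest is made up for illustration and is not part of the NaCl SDK):

```javascript
// Hypothetical helper, not part of the NaCl SDK: builds the text of a
// PNaCl .nmf manifest which points at a single .pexe file.
function makePnaclManifest(pexeUrl) {
    var manifest = {
        program: {
            portable: {
                "pnacl-translate": {
                    url: pexeUrl
                }
            }
        }
    };
    // indent with 4 spaces, like the hand-written manifest shown earlier
    return JSON.stringify(manifest, null, 4);
}

// print the manifest text for the dragons demo
console.log(makePnaclManifest("dragons.pexe"));
```

Writing the returned string to dragons.nmf would give the same manifest as the hand-written one.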

Interaction between a PNaCl application and the surrounding web page works through the Javascript messaging system. To get events from the PNaCl application, just add event listeners to the embed element:

<script type="text/javascript">
    // ...
    var naclModule = document.getElementById("pnacl_module");
    naclModule.addEventListener('loadstart', handleLoadStart, true);
    naclModule.addEventListener('progress', handleProgress, true); 
    naclModule.addEventListener('load', handleLoad, true);
    naclModule.addEventListener('error', handleError, true);
    naclModule.addEventListener('crash', handleCrash, true);
    naclModule.addEventListener('message', handleMessage, true);
    // ...
</script>

The other way around works as well, by sending messages to the PNaCl app through postMessage.
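Put together, both directions could look roughly like this. This is only a sketch: the helper names connectModule and sendToModule are made up, and what goes into the messages is entirely up to the application.

```javascript
// Hypothetical helpers for illustration, the names are not part of any SDK.

// Subscribe to messages coming from the PNaCl module. 'moduleElement' is
// the embed element, e.g. document.getElementById("pnacl_module").
function connectModule(moduleElement, onMessage) {
    moduleElement.addEventListener('message', function (event) {
        // event.data carries whatever the native side has posted
        onMessage(event.data);
    }, true);
}

// Send a message from the web page to the PNaCl module.
function sendToModule(moduleElement, payload) {
    moduleElement.postMessage(payload);
}

// Usage inside the web page (needs a real embed element):
//   var naclModule = document.getElementById("pnacl_module");
//   connectModule(naclModule, function (data) { console.log(data); });
//   sendToModule(naclModule, "hello");
```

Passing the embed element in as an argument keeps the helpers independent of the page layout, so they can also be tried out with a stand-in object outside the browser.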

The End

Ok, that's it. Next up I'll go through the changes to the Nebula3 Application Model which were necessary for the web platforms!

Written with StackEdit.