The Brain Dump: Adding functionality to threaded subsystems

Moving subsystems into their own thread introduces restrictions on how other threads can interact with the subsystem. It is no longer possible to simply invoke methods on objects running in the context of a threaded subsystem. The only way to interact with the subsystem is by sending messages to it. From a system design point-of-view this is a good thing. There's a very clear demarcation line defined by the message protocol to interact with the subsystem. It is pretty much impossible to invoke undocumented functionality from the outside and it is complicated to "accidently" use the subsystem's functionality in a way not intended by the subsystem's designer.

But of course those restrictions also have their dark side. All tasks which either require a lot of communication, or which require exact synchronization should better not be spread across threads. Although the messaging system is fast (and will remain an optimization hotspot) it is not free, it's not a good idea to send thousands (or even hundreds) of messages around per-frame. Also, a message sender should never wait for the completion of a message to work around the synchronization problem (at least not while the game loop is running), as this would pretty much nullify the advantage of running the subsystem in its own thread.

Nebula3 offers a relatively simple way to add functionality which shall run in the context of a subsystem thread. The basic idea is to create a new message-handler class (which is running in the subsystem's thread) and a new set of messages which can be processed by an instance of the new handler-class.

We recently did this to add debug-visualization capability to Nebula3. We wanted to have a simple way to (a) render debug text, and (b) render shapes (cubes, spheres, etc...) to make it simple to render debug-visualizations from anywhere in Nebula3.

The whole system is split into 3 parts:

The front-end classes running on the client-side (client-side means: every thread other then the render thread):

the Debug::DebugTextRenderer singleton offers text rendering
the Debug::DebugShapeRenderer singleton offers shape rendering
both are thread-local singletons, each thread which wants to render debug text or shapes needs to instantiate those

The back-end classes running in the render-thread:

CoreGraphics::TextRenderer
CoreGraphics::ShapeRenderer
these singletons implement the actual text- and shape-rendering functionality and are also platform-specific (under Windows, they use D3DX methods to do their jobs)

The communication components:

the Debug Render message protocol, this is a NIDL-XML-file (Nebula Interface Definition Language) which defines 2 messages: RenderDebugText and RenderDebugShapes
the DebugGraphicsHandler object, whose class is derived from Messaging::Handler, runs in the render thread, and processes the above 2 messages

This is how the system works:

the main thread instructs the GraphicsInterface singleton (which creates and manages the render-thread) to add a DebugGraphicsHandler object (that's at least how it SHOULD work, at the moment, the GraphicsHandler simply creates and attaches a DebugGraphicsHandler on its own)
client threads create one DebugTextRenderer and one DebugShapeRenderer singleton if they want to do debug visualization
a client-thread calls directly one of the DebugTextRenderer or DebugShapeRenderer methods to render text or shapes
the DebugTextRenderer and DebugShapeRenderer singletons collect a whole frame's worth of text elements and shapes and once per frame, create a single RenderDebugText and RenderDebugShapes message, so at most only 2 messages are sent into the render thread per-frame from each client-thread, not one message per shape and text element, that's a very important optimization!
Once per render-frame, the DebugGraphicsHandler processes incoming RenderDebugText and RenderDebugShapes by calling the CoreGraphics::TextRenderer and CoreGraphics::ShapeRenderer singletons

That's it basically. Nebula3 applications can add their own functionality to subsystem threads by following the described pattern.

With the first naive implementation we stumbled across an obvious problem: when the main-thread runs slower then the graphics thread, debug shapes and text would start to flicker, since the render thread would only receive render-debug-messages every other frame. So we had to add a way to identify shapes and text elements by their origin-thread-id, and keep them around until the next message comes in from the same thread, but this was a trivial thing to do.

A positive effect is that debug visualization no longer needs to happen at a specific point in the render loop. This was a problem in Nebula2/Mangalore where classes had to provide an "OnRenderDebug()" method which was called by the rendering system from within the render loop. Instead debug visualization can now happen from anywhere in the code (although at the cost of some more memory and communications overhead, but especially debug visualizations is an area where convenience and ease-of-use is more important then raw performance).

FYI, this is how the NIDL-file looks like, which defines the messages of the DebugRender protocol:

<?xml version="1.0" encoding="utf-8"?>

</Message>

</Message>

</Protocol>

</Nebula3>

This will be compiled by the Nebula3 NIDL-compiler-tool into one C++ header and one source file (debugrenderprotocol.h and debugrenderprotocol.cc).

I hope to have a new source drop out "really-soon-now", so you can check for yourself what I'm actually talking about :)

20 Sept 2008

Adding functionality to threaded subsystems