The Brain Dump: 06/09

27 Jun 2009

A more streamlined shader system

I’m currently porting N3’s shader system to the PS3. It basically works, but I’m not satisfied: a solution that’s really optimized for the PS3 isn’t possible without creating my own version of the DirectX FX system. N3 currently depends too much on obscure FX features which aren’t actually necessary for a realtime 3D engine. So this is a good time to think about trimming a lot of fat from the shader classes in CoreGraphics.

I think the higher level state management parts are in pretty good shape. The FrameShader system divides a render-frame into render-passes, posteffect-passes and render-batches, and draw-calls are grouped by shader-instance. Together these do a good job of preventing redundant state switches. So above the CoreGraphics subsystem (almost) everything is fine.

Here’s how shaders are currently used in N3:

FramePass: a pass-shader sets render states which are valid throughout the whole pass (for instance, a depth-pass may disable color-writes, and so on…), but doesn’t contain vertex- or pixel-shaders
FrameBatch: this is one level below passes, a batch-shader sets render state which is valid for rendering a batch of ModelNodeInstances with the same ModelNodeType (for instance, a render batch which contains all SrcAlpha/InvSrcAlpha-blended objects would configure the alpha blending render states accordingly). FrameBatches usually don’t have a vertex/pixel-shader assigned, but there are cases where this is useful, if everything that is rendered during the batch uses the same shader anyway.
PostEffects: a post-effect-pass uses a shader to render a fullscreen-quad
StateNodes and StateNodeInstances: node-instances are always rendered in batches, so material parameters only need to be set once at the start of a node-batch. Other parameters (most notably the ModelViewProjection matrix) must be set before each draw-call.
Specific subsystems may use shaders for their own rendering as they please, but let’s just say that those will have to live with the limitations of the new shader system ;)

This defines a pretty small set of requirements for a refactored Nebula3 shader system. By removing feature requirements and making the system *less* flexible, we have a better foundation for low-level optimizations and especially for porting the code to other platforms without compromises.

Here’s a quick overview of the current shader system:

The important classes and concepts to understand N3’s shader system are Shader, ShaderInstance, ShaderVariation and ShaderVariable. The basic idea of the shader system is to move all the slow stuff into the setup phase (for instance looking up shader variables) instead of doing it every frame.

A Shader encapsulates an actual shader file loaded from disc. Shaders are created once at application start, one Shader object per shader file in the shd: directory. Shaders are not actually used for anything except providing a template for ShaderInstances.

A ShaderInstance is created from a specific shader and contains its own set of shader parameter values. For instance, every Models::StateNode object creates its own ShaderInstance and initializes it with its specific material parameters.

A ShaderVariable is created from a ShaderInstance and is used to set the value of a specific shader parameter of the ShaderInstance.

ShaderVariations are the most interesting part of the system. Under the hood, a ShaderVariation is an FX Technique, and thus can have a completely different set of render states and vertex/pixel shader code. The variation that is currently used for rendering is selected by a “feature bit mask”, and the bits of the feature mask can be set and deleted in different parts of the rendering pipeline. For instance, the Depth Pass may set the “Depth” feature bit, and a character rendering node may set the “Skinned” feature bit, resulting in the feature mask “Depth|Skinned”. A Shader can contain any number of variations, each with a specific feature mask. Thus a shader which supports rendering in the depth pass, and also supports skinning would implement (among others) a variation “Depth|Skinned” for rendering skinned geometry during the depth pass.
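The feature-mask lookup can be sketched in a few lines of plain C++. This is only an illustration, not N3 code: the names `FeatureBit`, `Variation` and `ShaderSketch` are made up, the bit values are arbitrary, and the sketch assumes a variation is selected by an exact match of the complete mask.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical feature bits (the values are arbitrary for this sketch).
enum FeatureBit : std::uint32_t {
    Depth   = 1u << 0,
    Skinned = 1u << 1,
    Alpha   = 1u << 2,
};

// Stand-in for a ShaderVariation (an FX technique in the real system).
struct Variation {
    std::string techniqueName;
};

// Stand-in for a Shader: maps a complete feature mask to one variation.
class ShaderSketch {
public:
    void AddVariation(std::uint32_t mask, const std::string& technique) {
        variations[mask] = Variation{technique};
    }

    // Returns the variation whose mask matches exactly, or nullptr.
    const Variation* SelectVariation(std::uint32_t mask) const {
        auto it = variations.find(mask);
        return it != variations.end() ? &it->second : nullptr;
    }

private:
    std::unordered_map<std::uint32_t, Variation> variations;
};
```

Different parts of the pipeline would OR their bits into one mask (for instance `Depth` from the depth pass and `Skinned` from a character node) and the combined mask then picks the variation.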

The biggest problem with the current system is that ShaderVariables are so damn powerful, thanks to the FX system. FX implicitly caches the current value of shader variables, so that a variable can be set at any time before or after the shader is actually used for rendering. FX also tracks changes to shader variables and knows which shader variables need to be updated when it comes to rendering. Nebula3 doesn’t actually need this extra caching layer, and removing it allows a much more streamlined implementation where FX is not available or not desirable.

Here are the changes I currently have in mind:

A lot of reflection information will not be available. A Shader object won’t be required to enumerate the shader parameters and query their type. The only information that will probably survive is whether a specific parameter exists, but not its datatype, array size, etc… This is not really a problem for a game application, since the reflection information is usually only needed in DCC tools.
Querying the current value of a shader parameter won’t be possible (in fact that’s already not possible in N3, even though FX offers this feature).
ShaderInstances will offer a way to setup shader parameters as “immutable”. Immutable parameters are set once but cannot be changed afterwards. For instance, the material parameters of a StateNode won’t be changed once they are initialized, and setting them up as immutable allows the shader to record them into some sort of opaque command buffer or state block and then completely forget about them (so it doesn’t have to keep track of immutable parameters at all once they have been setup).
Mutable shader parameters (like the ModelViewProjection matrix) are not required to cache their current value. This means that N3 may only set mutable shader parameters after the ShaderInstance has been set as the active shader. This may be a bit inconvenient here and there, but it relieves the low-level shader system from a lot of house-keeping.
With the 2 previous points a ShaderInstance doesn’t have to keep track of shader parameters at all. All the immutables are stored away in command buffers, and all the mutables could directly go through to the rendering API.
It won’t be possible to change rasterization state through shader variables at all (like CullMode or AlphaRef - basically, everything that’s set inside a FX Technique), shader variables may only map to vertex- or pixel-shader uniform constants.
N3 still needs to call a Commit() method on the shader instance after setting mutable parameters and before issuing the draw-call since depending on the platform, some housekeeping and batching may still be necessary for setting mutable parameters efficiently.
Shared shader parameters will (probably) go away. They are convenient but hard to replicate on other platforms. Maybe a similar mechanism will be implemented through a new N3 class so that platform ports have more freedom in implementing shared shader parameters (shared parameters will likely have to be registered by N3 instead of being automatically setup by the platform’s rendering API).
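To make the immutable/mutable split a bit more concrete, here is a minimal sketch in plain C++. Everything in it is hypothetical (`FakeDevice`, `ShaderInstanceSketch`, float-only parameters), and it assumes a platform where mutable changes are batched until Commit(). On other platforms they might go straight through to the rendering API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the rendering API: it only records what was sent to it.
struct FakeDevice {
    std::vector<std::pair<std::string, float>> constantsSet;
    void SetConstant(const std::string& name, float value) {
        constantsSet.emplace_back(name, value);
    }
};

class ShaderInstanceSketch {
public:
    // Immutable parameters are recorded once during setup...
    void SetImmutable(const std::string& name, float value) {
        assert(!sealed && "immutable parameters can only be set during setup");
        commandBuffer.emplace_back(name, value);
    }
    void EndSetup() { sealed = true; }

    // ...and replayed as one block when the instance becomes active.
    void Apply(FakeDevice& dev) {
        for (const auto& cmd : commandBuffer) {
            dev.SetConstant(cmd.first, cmd.second);
        }
        active = &dev;
    }

    // Mutable parameters are not cached, so they are only valid
    // after the instance has been made the active shader.
    void SetMutable(const std::string& name, float value) {
        assert(active && "set mutable parameters only on the active shader");
        pending.emplace_back(name, value);
    }

    // Flushes the batched mutable changes right before the draw call.
    void Commit() {
        for (const auto& p : pending) {
            active->SetConstant(p.first, p.second);
        }
        pending.clear();
    }

private:
    std::vector<std::pair<std::string, float>> commandBuffer;
    std::vector<std::pair<std::string, float>> pending;
    FakeDevice* active = nullptr;
    bool sealed = false;
};
```

Note that the instance never stores the current value of a parameter for later queries. It only holds the recorded setup block and the not-yet-committed changes.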
The last fix will be in the higher-level code where ModelNodeInstances are rendered:

Currently, draw-calls are grouped by their ShaderInstance, but those instance-groups themselves are not sorted in a particular order. In the future, rendering will also be sorted by Shader and active ShaderVariation, so that the vertex and pixel shaders change less frequently (changing the pixel shader is particularly bad).
Currently, a ShaderInstance object contains a complete cloned FX effect instance. In the future, a shader instance will not use a cloned effect. Instead, immutable parameters will be captured into a parameter block and mutable parameter changes will be routed directly to the “parent” FX effect.
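The planned sorting could look roughly like the following sketch. The `DrawCall` struct and its integer ids are assumptions made for illustration. The point is only the sort order: shader first, then variation, then instance.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical draw-call record, reduced to the three sort criteria.
struct DrawCall {
    std::uint32_t shaderId;
    std::uint32_t variationMask;
    std::uint32_t instanceId;
};

// Sorts by shader, then variation, then shader instance.
inline void SortForRendering(std::vector<DrawCall>& calls) {
    std::sort(calls.begin(), calls.end(),
              [](const DrawCall& a, const DrawCall& b) {
                  return std::tie(a.shaderId, a.variationMask, a.instanceId)
                       < std::tie(b.shaderId, b.variationMask, b.instanceId);
              });
}

// Counts how often the (shader, variation) pair changes while walking
// the list in order; the first draw call counts as one switch.
inline int CountShaderSwitches(const std::vector<DrawCall>& calls) {
    int switches = 0;
    for (std::size_t i = 0; i < calls.size(); ++i) {
        if (i == 0 ||
            calls[i].shaderId != calls[i - 1].shaderId ||
            calls[i].variationMask != calls[i - 1].variationMask) {
            ++switches;
        }
    }
    return switches;
}
```

With four draw calls that alternate between two shaders, the unsorted list switches shaders four times and the sorted list only twice.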

Another mid-term-goal I want to aim for is to reduce the number of per-draw-call shader-parameter-changes as much as possible. There are a lot of lighting parameters fed into the pixel shader for each draw call currently. Whether this means going to some sort of deferred rendering approach or maybe setting up per-model lighting parameters in a dynamic texture, or trying to move more work back into the vertex shader I don’t know yet.

20 Jun 2009

N3 I/O Tips & Tricks

Note: all of the following code-fragments assume:

using namespace Util;
using namespace IO;

Working with Assigns

Assigns are path aliases which are used instead of hardwired file locations. This lets an N3 application use filenames which are independent from the host platform, actual Windows version, or Windows language version. The following “system assigns” are pre-defined in a standard Nebula3 application:

home: Points to the app’s installation directory, or (on console platforms) the root directory of the game content. The home: location should always be treated as read-only!
bin: Points to the location where the app’s executable resides, should be treated as read-only (not available on console platforms)
user: On Windows, points to the logged-in user’s data directory (e.g. on Windows 7 this is c:\Users\[user]\Documents). Can be treated as read/write, and this is where profile-data and save-game-files should be saved. On consoles this assign may point to the save-location, or it may not be available at all (if saving game data is not handled through some sort of file system on that specific platform).
temp: On Windows, this points to the logged-in user’s temp directory (e.g. on Windows 7 this is c:\Users\[user]\AppData\Local\Temp). This directory is read/write but applications should assume that files in this directory may disappear at any time. This assign is not available on console platforms.
programs: On Windows, this points to the standard location for programs (e.g. “c:\Program Files”)
appdata: On Windows, this points to the user’s AppData directory (e.g. c:\Users\[user]\AppData)

In addition to these system assigns, Nebula3 sets up the following “content assigns” at startup which all applications can rely on:

export: points to the root of the directory where all game data files reside
- ani: root directory of animation files
- data: root directory of general data files
- video: root directory of movie files
- seq: root directory of sequence files (e.g. engine-cutscenes)
- stream: root directory of streaming audio files
- tex: root directory of texture files
- frame: root directory of frame shader files
- mdl: root directory of .n3 files
- shd: root directory of shader files
- audio: root directory for non-streaming audio files
- sui: root directory for “Simple GUI” scene files

More standard assigns may be added in the future.

An application can define its own assigns or override existing assigns using the IO::AssignRegistry singleton:

AssignRegistry::Instance()->SetAssign(Assign("bla", "home:blub"));

To use the new assign, simply put it at the beginning of a typical file path:

“bla:readme.txt”

This would resolve to the following absolute filename (assuming your app is called “MyApp” and located under the standard location for programs under Windows):

“C:\Program Files\MyApp\blub\readme.txt”

You can resolve a path name with assigns into an absolute file name through the AssignRegistry, this is usually necessary when working with 3rd party libs:

String absPath = AssignRegistry::Instance()->ResolveAssignsInString("bla:readme.txt");
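How nested assigns unfold can be shown with a small stand-alone sketch. This is not the N3 implementation: `AssignRegistrySketch` is a made-up class, it uses forward slashes only, and the depth limit of 16 is an arbitrary guard against cyclic assigns.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

class AssignRegistrySketch {
public:
    void SetAssign(const std::string& name, const std::string& path) {
        assigns[name] = path;
    }

    // Replaces a leading "name:" with its registered path and repeats
    // until the prefix is no longer a known assign, so assigns can nest.
    std::string Resolve(std::string path) const {
        for (int depth = 0; depth < 16; ++depth) {
            const std::size_t colon = path.find(':');
            if (colon == std::string::npos) break;
            auto it = assigns.find(path.substr(0, colon));
            if (it == assigns.end()) break;  // e.g. "C" or "http"
            const std::string rest = path.substr(colon + 1);
            std::string base = it->second;
            if (!rest.empty() && !base.empty() &&
                base.back() != '/' && base.back() != ':') {
                base += '/';
            }
            path = base + rest;
        }
        return path;
    }

private:
    std::map<std::string, std::string> assigns;
};
```

Prefixes that are not registered (a drive letter, or a URI scheme like "http") stop the loop, which is why the resolved string can safely start with "C:" or "http://".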

Finally, Assigns are not restricted to file system locations:

AssignRegistry::Instance()->SetAssign(Assign("bla", "http://www.radonlabs.de/blub"));

Listing directory content

You can list the files or subdirectories of a directory with pattern matching like this:

// list all files in the app’s export directory
Array<String> files = IoServer::Instance()->ListFiles("home:export", "*");

// list all subdirectories in the app’s export directory
Array<String> dirs = IoServer::Instance()->ListDirectories("home:export");

// list all DDS textures of category “leafs”
Array<String> leafTextures = IoServer::Instance()->ListFiles("tex:leafs", "*.dds");

Note that the returned strings are not full pathnames, only the actual file- and directory-names!

Working with directories

You can create directories and subdirectories with a single method call:

IoServer::Instance()->CreateDirectory("home:bla/blub/blob");

This will also create all missing subdirectories as needed.

You can delete the tail directory of a path, but the directory must be empty:

IoServer::Instance()->DeleteDirectory("home:bla/blub/blob");

This will delete the “blob” subdirectory only, and only if there are no files or subdirectories left under “blob”.

You can check whether a directory exists:

if (IoServer::Instance()->DirectoryExists("home:bla/blub"))
{
    // directory exists
}

NOTE:

creating and deleting directories in archive files doesn’t work
all directory functions only work in the file system, so a DirectoryExists(“http://www.radonlabs.de/bla”) will *not* work

Working with files

The following IoServer methods are available for files, these all work directly with path names, so you don’t need to have an actual Stream object around:

// check whether a file exists:
if (IoServer::Instance()->FileExists("home:readme.txt")) …

// delete a file:
IoServer::Instance()->DeleteFile("home:readme.txt");

// copy a file:
IoServer::Instance()->CopyFile("home:src.txt", "home:dst.txt");

// check if the read-only flag is set on a file:
if (IoServer::Instance()->IsReadOnly("home:src.txt")) …

// set the read-only flag on a file:
IoServer::Instance()->SetReadOnly("home:src.txt");

// getting the last modification time of a file:
FileTime fileTime = IoServer::Instance()->GetFileWriteTime("home:readme.txt");

// setting the last modification time of a file:
IoServer::Instance()->SetFileWriteTime("home:readme.txt", fileTime);

NOTE:

DeleteFile(), SetReadOnly(), SetFileWriteTime() do not work in file archives
CopyFile() does not work if the destination is located in a file archive
all of these functions only work with file system paths (not “http://…”)

How to get the size of a file

Currently, the only way to query the size of a file is through an open IO::Stream object. This may change in the future though.

Ptr<Stream> stream = IoServer::Instance()->CreateStream("home:readme.txt");
if (stream->Open())
{
    Stream::Size fileSize = stream->GetSize();
    stream->Close();
}

Working with file archives

You can mount zip archives as an overlay over the actual file system:

IoServer::Instance()->MountArchive("home:archive.zip");

All file system accesses will first check mounted archives (in mount order) before falling back to the actual file system.
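The lookup order can be illustrated with a tiny in-memory model. The classes below (`ArchiveSketch`, `OverlayFileSystemSketch`) are invented for this sketch and hold file contents as strings, which has nothing to do with how N3 actually stores archive data.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a mounted archive: a name plus its table of contents.
struct ArchiveSketch {
    std::string name;
    std::map<std::string, std::string> toc;  // path -> file content
};

class OverlayFileSystemSketch {
public:
    void MountArchive(ArchiveSketch archive) {
        archives.push_back(std::move(archive));
    }
    void AddDiskFile(const std::string& path, const std::string& content) {
        disk[path] = content;
    }

    // Checks mounted archives in mount order, then falls back to "disk".
    // Returns true and fills 'out' if the file was found anywhere.
    bool Read(const std::string& path, std::string& out) const {
        for (const auto& archive : archives) {
            auto it = archive.toc.find(path);
            if (it != archive.toc.end()) {
                out = it->second;
                return true;
            }
        }
        auto it = disk.find(path);
        if (it != disk.end()) {
            out = it->second;
            return true;
        }
        return false;
    }

private:
    std::vector<ArchiveSketch> archives;
    std::map<std::string, std::string> disk;
};
```

A file that exists both in an archive and on disk is therefore always served from the archive, and of two archives the one mounted first wins.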

Archives have the following restrictions:

writing to archives is not supported
the archive filesystem will keep some sort of table-of-content in memory as long as an archive is mounted, the actually required size of the TOC differs by platform.
in the current zlib-based implementation, complete files are decompressed into memory, thus opening a 100 MB file from an archive will also allocate 100 MB of memory until the file is closed
for the above reason, streaming from archive files doesn’t make sense, thus things like streaming audio wave banks or movie files should not be placed into archive files

On console platforms, platform-native file archive formats are used if available (e.g. .PSARC on PS3 or .ARC on Wii), which usually have fewer restrictions and are better optimized for the platform than plain ZIP support. The Xbox360 port currently uses the standard zlib implementation, but this will very likely change in the future.

On Windows (and currently Xbox360), zip support is handled through zlib.

You can add support for other archive formats by deriving subclasses from the classes under foundation/io/archfs, but currently it is not possible to mix different archive formats in one application (because you need to decide on a specific archive-filesystem-implementation at compile-time).

You can turn off the archive file-system layer completely through IoServer::SetArchiveFileSystemEnabled(false). All file accesses will then go directly into the actual file system. This is useful for tools which need to make sure that they don’t accidentally read data from an archive file.

Nebula3 defines a “standard archive” where all game data is located. The data in the archive is located under the “export:” assign on all platforms. The actual archive filenames for the various platforms are:

Win32: home:export_win32.zip
Xbox360: home:export_xbox360.zip
Wii: home:export_wii.arc
PS3: home:export_ps3.psarc

The archiver3.exe tool takes care of generating those standard archives as part of the build process. When generating data for console platforms, the actual console SDK must be installed (however, please note that we cannot currently license the N3 console ports to other companies anyway).

Working with the SchemeRegistry

The IO::SchemeRegistry singleton associates URI schemes (those things at the start of an URI, e.g. “file://…”, “http://…”) with Stream classes. You can override the pre-defined scheme associations or register your own scheme with a stream class of your own:

// register my own scheme and stream class:
SchemeRegistry::Instance()->RegisterUriScheme("myscheme", MyStream::RTTI);

// create a stream object by URI:
Ptr<Stream> stream = IoServer::Instance()->CreateStream("myscheme://bla/blub");

// the returned stream object should be an instance of our derived class:
n_assert(stream->IsInstanceOf(MyStream::RTTI));

You can also override standard associations to route all file accesses through your own stream class like this:

// override the file scheme to use our own stream class:
SchemeRegistry::Instance()->RegisterUriScheme("file", MyStream::RTTI);
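The idea behind the registry is a plain map from scheme name to "something that can create a stream". The following sketch uses creator functions where N3 uses Rtti objects, and all class names ending in `Sketch` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct StreamSketch {
    virtual ~StreamSketch() = default;
    virtual std::string Kind() const = 0;
};
struct FileStreamSketch : StreamSketch {
    std::string Kind() const override { return "file"; }
};
struct MyStreamSketch : StreamSketch {
    std::string Kind() const override { return "my"; }
};

class SchemeRegistrySketch {
public:
    using Creator = std::function<std::unique_ptr<StreamSketch>()>;

    // Registering an existing scheme again overrides the old association.
    void RegisterUriScheme(const std::string& scheme, Creator creator) {
        creators[scheme] = std::move(creator);
    }

    // Picks the stream class from the part of the URI before "://".
    // Returns nullptr for malformed URIs and unknown schemes.
    std::unique_ptr<StreamSketch> CreateStream(const std::string& uri) const {
        const std::size_t pos = uri.find("://");
        if (pos == std::string::npos) return nullptr;
        auto it = creators.find(uri.substr(0, pos));
        if (it == creators.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Creator> creators;
};
```

Because registration simply overwrites the map entry, routing all "file" accesses through a custom stream class is the same call as registering a brand new scheme.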

Reading and writing XML files

Attach an IO::XmlReader object to an IO::Stream object to parse the content of an XML file. The XmlReader can access nodes through path names, so you can navigate XML nodes like files in a file system. The XmlReader tracks a “current node” internally, like the “current directory” in a file system API.

// create an XML reader and parse the file “home:test.xml”:
Ptr<Stream> stream = IoServer::Instance()->CreateStream("home:test.xml");
Ptr<XmlReader> xmlReader = XmlReader::Create();
xmlReader->SetStream(stream);
if (xmlReader->Open())
{
    // test if a specific node exists in the XML file:
    if (xmlReader->HasNode("/Nebula3/Models")) …

    // position the current node on “/Nebula3/Models”:
    xmlReader->SetToNode("/Nebula3/Models");

    // iterate over child nodes of current node:
    if (xmlReader->SetToFirstChild()) do
    {
        …
    } while (xmlReader->SetToNextChild());

    // iterate over child nodes named “Model”:
    if (xmlReader->SetToFirstChild("Model")) do
    {
        …
    } while (xmlReader->SetToNextChild("Model"));

    // test if the current node has a “name” attribute
    // and read its value as a string:
    if (xmlReader->HasAttr("name"))
    {
        String name = xmlReader->GetString("name");
    }

    // if the “name” is optional, you can also do this in one line of
    // code and provide a default value, if “name” is not present:
    String name = xmlReader->GetOptString("name", "DefaultName");

    // you can also read simple data types directly:
    int intVal = xmlReader->GetInt("intAttr");
    float floatVal = xmlReader->GetFloat("floatAttr");
    …

    // to read the current node’s content (<Node>Content</Node>):
    if (xmlReader->HasContent())
    {
        String content = xmlReader->GetContent();
    }

    // close the reader when done:
    xmlReader->Close();
}

To create a new XML file, use an XmlWriter in a similar fashion:

Ptr<Stream> stream = IoServer::Instance()->CreateStream("temp:bla.xml");
Ptr<XmlWriter> xmlWriter = XmlWriter::Create();
xmlWriter->SetStream(stream);
if (xmlWriter->Open())
{
    // write a node hierarchy, and add a few attributes to the leaf node:
    xmlWriter->BeginNode("Nebula3");
      xmlWriter->BeginNode("Models");
        xmlWriter->BeginNode("Model");
          xmlWriter->SetString("name", "A Model");
          xmlWriter->SetInt("intVal", 20);
          …
        xmlWriter->EndNode();
      xmlWriter->EndNode();
    xmlWriter->EndNode();

    // close xml writer, this will also close the stream object
    xmlWriter->Close();
}

// you can also write a comment to the XML file:
xmlWriter->WriteComment("A Comment");

// or write the content enclosed by the current node:
xmlWriter->WriteContent("Content");

Additional notes on XML processing:

XmlReader and XmlWriter use TinyXml internally which has a tiny modification to read and write data through Nebula3 stream objects instead of the host filesystem

Reading large XML files can be very slow because of the thousands of small allocations going on for string data. Reading XML files is therefore not recommended for actual game applications; use optimized binary formats or Nebula3’s database subsystem instead.

There’s currently no easy way to read an XML file, modify it and write it back.

Working with BinaryReader / BinaryWriter

The IO::BinaryReader and IO::BinaryWriter classes implement access to streams as a sequence of simple typed data elements like int, float, float4 or strings with automatic byte order conversion for different platforms:

// read data from a file using BinaryReader:
Ptr<Stream> stream = IoServer::Instance()->CreateStream("home:bla.bin");
Ptr<BinaryReader> reader = BinaryReader::Create();
reader->SetStream(stream);
if (reader->Open())
{
    uchar ucharVal = reader->ReadUChar();
    float floatVal = reader->ReadFloat();
    String strVal = reader->ReadString();
    Math::matrix44 matrixVal = reader->ReadMatrix44();
    Blob blob = reader->ReadBlob();
    …
    reader->Close();
}

// writing data is just the other way around:
Ptr<Stream> stream = IoServer::Instance()->CreateStream("temp:bla.bin");
Ptr<BinaryWriter> writer = BinaryWriter::Create();
writer->SetStream(stream);
if (writer->Open())
{
    writer->WriteUShort(123);
    writer->WriteString("Bla");
    …
    writer->Close();
}
}

The byte order of BinaryReader/Writer is by default set to ByteOrder::Host (the host platform’s native byte order). You can enable automatic byte order conversion with the SetStreamByteOrder() method on BinaryReader and BinaryWriter. For instance, to create a binary file for one of the PowerPC-driven consoles from a tool running on the PC you would setup the BinaryWriter like this:

Ptr<Stream> stream = IoServer::Instance()->CreateStream("temp:bla.bin");
Ptr<BinaryWriter> writer = BinaryWriter::Create();
writer->SetStream(stream);
writer->SetStreamByteOrder(System::ByteOrder::BigEndian);
if (writer->Open()) …

Automatic byte order conversion naturally doesn’t work for Util::Blob objects, since the reader/writer doesn’t know how the data inside the blob is structured.
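What the conversion boils down to for a single typed element can be sketched like this. The functions and the `ByteOrderSketch` enum are hypothetical and work on a byte vector instead of a stream. The sketch also composes the value from shifted bytes rather than swapping relative to the host order, which may differ from how N3 does it internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ByteOrderSketch { LittleEndian, BigEndian };

// Appends a 32-bit value in the requested byte order. Working with
// shifts makes the result independent of the host's native order.
inline void WriteUInt32(std::vector<std::uint8_t>& out, std::uint32_t value,
                        ByteOrderSketch order) {
    for (int i = 0; i < 4; ++i) {
        const int shift =
            (order == ByteOrderSketch::BigEndian) ? (24 - 8 * i) : (8 * i);
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFFu));
    }
}

// Reads a 32-bit value stored at 'offset' in the given byte order.
inline std::uint32_t ReadUInt32(const std::vector<std::uint8_t>& in,
                                std::size_t offset, ByteOrderSketch order) {
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        const int shift =
            (order == ByteOrderSketch::BigEndian) ? (24 - 8 * i) : (8 * i);
        value |= static_cast<std::uint32_t>(in[offset + i]) << shift;
    }
    return value;
}
```

This only works because the writer knows the element is a 32-bit integer. For an opaque blob there is no such knowledge, so the bytes have to be passed through untouched.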

Additional Notes:

Only use BinaryReader / BinaryWriter in an actual game project when absolutely necessary. For the sake of efficient and fast loading from disk it’s usually better to prepare any sort of game data as a native memory dump which can be loaded with a simple Stream::Read() and immediately used without any parsing or data conversions going on during load. Time spent in the build pipeline is a thousand times cheaper than time spent waiting for a level to load, thus BinaryReader and BinaryWriter are much better used in offline tools!

Even in offline tools, BinaryReader/Writer can be very slow when processing thousands of data elements, since reading or writing each little data element will cause a complete round-trip through the ReadFile / WriteFile functions of the host’s operating system. Use the SetMemoryMappingEnabled(true) method to speed up reading and writing of data elements drastically by caching the data in memory. In the BinaryReader, this will load the entire file into memory in Open(), and in the BinaryWriter, all writes will go into a memory buffer first, which will then be dumped to a file in Close() with a single Write().
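The effect of buffering writes in memory can be shown with a stand-alone sketch. `CountingStream` and `BufferedWriterSketch` are invented for the example, and the stream just counts how often its Write() is called instead of touching a real file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a slow stream: every Write() call represents one
// round-trip to the operating system.
struct CountingStream {
    int writeCalls = 0;
    std::vector<std::uint8_t> data;
    void Write(const void* ptr, std::size_t numBytes) {
        ++writeCalls;
        const std::uint8_t* bytes = static_cast<const std::uint8_t*>(ptr);
        data.insert(data.end(), bytes, bytes + numBytes);
    }
};

class BufferedWriterSketch {
public:
    explicit BufferedWriterSketch(CountingStream& s) : stream(s) {}

    void WriteUShort(std::uint16_t v) { Append(&v, sizeof(v)); }
    void WriteFloat(float v) { Append(&v, sizeof(v)); }

    // Dumps the whole memory buffer to the stream with a single Write().
    void Close() {
        if (!buffer.empty()) {
            stream.Write(buffer.data(), buffer.size());
            buffer.clear();
        }
    }

private:
    // All element writes only append to the in-memory buffer.
    void Append(const void* ptr, std::size_t numBytes) {
        const std::uint8_t* bytes = static_cast<const std::uint8_t*>(ptr);
        buffer.insert(buffer.end(), bytes, bytes + numBytes);
    }

    CountingStream& stream;
    std::vector<std::uint8_t> buffer;
};
```

Writing a thousand small elements this way results in exactly one call into the underlying stream, at the cost of holding the complete data in memory until Close().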

Reading Excel XML files with Nebula3

You can use the IO::ExcelXmlReader stream reader class to parse files saved in XML format from MS Excel (all versions should work, but when in doubt, save as Excel 2003 XML file). After opening an Excel file, the content of the file can be accessed by table, column and row index:

Ptr<Stream> stream = IoServer::Instance()->CreateStream("home:excelsheet.xml");
Ptr<ExcelXmlReader> reader = ExcelXmlReader::Create();
reader->SetStream(stream);
if (reader->Open())
{
    // NOTE: when working with the left-most “default” table,
    // we can simply omit the table index in all methods.
    //
    // Test if a column exists in the left-most table:
    if (reader->HasColumn("Bla"))
    {
        // get the column index (returns InvalidIndex if it doesn't exist)
        IndexT colIndex = reader->FindColumnIndex("Bla");

        // iterate over all rows and read content of column “Bla”:
        IndexT rowIndex;
        SizeT numRows = reader->GetNumRows();
        for (rowIndex = 0; rowIndex < numRows; rowIndex++)
        {
            // get the content of cell at (rowIndex,colIndex):
            String elm = reader->GetElement(rowIndex, colIndex);
        }
    }
}

You can also access specific tables in an Excel file. In this case, an additional tableIndex parameter is used in most methods:

// get the number of tables in the Excel file:
SizeT numTables = reader->GetNumTables();

// find a table index by name (returns InvalidIndex if it doesn't exist)
IndexT tableIndex = reader->GetTableIndex("TableName");
// get the name of table at index
const String& tableName = reader->GetTableName(tableIndex);
// to access a particular table, simply add the table index as last
// parameter to all other methods:
SizeT numRowsOfTable = reader->GetNumRows(tableIndex);
SizeT numColsOfTable = reader->GetNumColumns(tableIndex);
bool hasColumn = reader->HasColumn("Bla", tableIndex);
String elm = reader->GetElement(rowIndex, colIndex, tableIndex);
…

Note that Excel XML files come with a lot more fluff than actual data, so reading an Excel table bigger than a few dozen or hundred kBytes is EXTREMELY slow. Do not use Excel XML files directly in your game application, but only as source data files in the content pipeline, and convert them to binary files or write them into an SQLite database during the build process for more efficient consumption by the game application.

How to register your own ConsoleHandler

Console handlers are attached to the IO::Console singleton to handle console output and optionally provide text input from a console. When Nebula3 starts up, a standard console handler is created in IO::Console::Open() which (under Windows) will route output to STD_OUTPUT_HANDLE and STD_ERROR_HANDLE, and provide non-blocking input from STD_INPUT_HANDLE. When not compiled with the PUBLIC_BUILD define, all text output will also go to OutputDebugString() and thus show up in the debugger, which is especially useful when running a windowed application (as opposed to a console application) under Windows.

On console platforms, the default ConsoleHandlers route all text output to the debug out channel.

Nebula3 provides 2 optional console handlers:

IO::LogFileConsoleHandler: can be used to capture all text output into a log file

IO::HistoryConsoleHandler: captures console output into an in-memory ring buffer. This is currently used by the IO::ConsolePageHandler for the Debug-HTTP-Server in order to provide a snapshot of console output in a web browser connected to the N3 application.

Registering your own console handler is easy: just derive a subclass from IO::ConsoleHandler, override a few virtual methods, and call IO::Console::AttachHandler() with a pointer to an instance of your derived class early in your application.
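The attach-and-broadcast pattern looks roughly like the sketch below. It does not use the real N3 classes: `ConsoleSketch`, `ConsoleHandlerSketch` and the single virtual `Print()` method are stand-ins, since the actual virtual methods of IO::ConsoleHandler are not listed in this post.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the handler base class with one hypothetical virtual.
class ConsoleHandlerSketch {
public:
    virtual ~ConsoleHandlerSketch() = default;
    virtual void Print(const std::string& text) = 0;
};

// Stand-in for the console: forwards all output to every handler.
class ConsoleSketch {
public:
    void AttachHandler(std::shared_ptr<ConsoleHandlerSketch> handler) {
        handlers.push_back(std::move(handler));
    }
    void Print(const std::string& text) {
        for (auto& handler : handlers) {
            handler->Print(text);
        }
    }

private:
    std::vector<std::shared_ptr<ConsoleHandlerSketch>> handlers;
};

// A custom handler that simply collects every line it receives.
class CollectingHandler : public ConsoleHandlerSketch {
public:
    void Print(const std::string& text) override { lines.push_back(text); }
    std::vector<std::string> lines;
};
```

Every attached handler sees the same output, which is how a log file handler and an in-memory history handler can run side by side.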

Phew, I think that’s all the stuff that’s good-to-know when doing file IO in Nebula3. Please note that this was all about synchronous I/O. For asynchronous IO the IO::IoInterface interface-singleton is used. It simply launches a thread with its own thread-local IoServer, and IO operations are sent to the IO thread using Message objects (which in turn need to be checked for completion of the IO operation). It’s not much more complicated than synchronous IO. However, I’m planning a few changes under the hood for asynchronous IO to enable better platform-specific optimizations in the future.

16 Jun 2009

Nebula3 RTTI Tips & Tricks

Note: I have omitted the namespace prefixes and ‘using namespace xxx’ statements in the code below to improve readability. Also, since I didn’t run the code through a compiler, there may be a lot of typos.

Don’t be confused by Rtti vs. RTTI:

Rtti is the class name, MyClass::RTTI is the name of the Rtti-object of a class. Every RefCounted derived class has exactly one static instance of Core::Rtti, which is initialized before main() is entered.

Check whether an object is an instance of a specific, or of a derived class:

This is the standard feature of the Nebula3 RTTI system, checking whether an object can safely be cast to a specific class interface:

// check whether obj is instance of a specific class:
if (obj->IsInstanceOf(MyClass::RTTI)) …

// check whether obj is instance of class, or a derived class:
if (obj->IsA(MyClass::RTTI))…

Compared to Nebula2, N3’s RTTI check is extremely fast (in N2, it was necessary to convert a class name string into a pointer first before doing the check). In N3, RTTI checks are simple pointer comparisons. The IsA() method is a bit slower if the class doesn’t match since it needs to walk the inheritance hierarchy towards the root. Because of this it is always better to use the IsInstanceOf() method if possible, since this is always only a single pointer comparison.
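The difference between the two checks is easy to see in a stripped-down model. The classes below (`RttiSketch`, `Base`, `Derived`) are made up for illustration and leave out everything the real Core::Rtti carries (class name, FourCC, creator function).

```cpp
#include <cassert>

// Minimal stand-in for an Rtti object: it only knows its parent.
class RttiSketch {
public:
    explicit RttiSketch(const RttiSketch* parentRtti) : parent(parentRtti) {}
    RttiSketch(const RttiSketch&) = delete;  // exactly one object per class

    // Walks towards the root and compares addresses on the way.
    bool IsDerivedFrom(const RttiSketch& other) const {
        for (const RttiSketch* cur = this; cur != nullptr; cur = cur->parent) {
            if (cur == &other) return true;
        }
        return false;
    }

private:
    const RttiSketch* parent;
};

struct Base {
    static const RttiSketch RTTI;
    virtual ~Base() = default;
    virtual const RttiSketch& GetRtti() const { return RTTI; }

    // Exact class match: a single pointer comparison.
    bool IsInstanceOf(const RttiSketch& rtti) const {
        return &GetRtti() == &rtti;
    }
    // Class or any parent class: may walk the whole hierarchy.
    bool IsA(const RttiSketch& rtti) const {
        return GetRtti().IsDerivedFrom(rtti);
    }
};
const RttiSketch Base::RTTI(nullptr);

struct Derived : Base {
    static const RttiSketch RTTI;
    const RttiSketch& GetRtti() const override { return RTTI; }
};
const RttiSketch Derived::RTTI(&Base::RTTI);
```

In this model a failing IsA() has to visit every ancestor before it can return false, while IsInstanceOf() always costs one comparison.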

Both methods also exist as class-name and class-fourcc versions, but of course these are both slower than the methods which work directly with the RTTI objects:

if (obj->IsInstanceOf("MyNamespace::MyClass")) …
if (obj->IsInstanceOf(FourCC('MYCL'))) …

if (obj->IsA("MyNamespace::MyClass")) …
if (obj->IsA(FourCC('MYCL'))) …

Using Ptr<> cast methods for safe-casting:

The Ptr<> class comes with 3 cast methods, 2 for safe up- and down-casts, and one unsafe-but-fast C-style cast. To do a down-cast (from a general parent class down to a specialized sub-class) you can do this:

// assume that res is a Ptr<Resource>, and safely down-cast
// it to a Ptr<D3D9Texture> (D3D9Texture is a subclass of Resource):
const Ptr<D3D9Texture>& d3d9Tex = res.downcast<D3D9Texture>();

This will generate a runtime-error if res is not a D3D9Texture object.

Safely casting upwards in the inheritance hierarchy works as well:

const Ptr<Resource>& res = d3d9Tex.upcast<Resource>();

An unsafe C-style cast is done like this:

const Ptr<Resource>& res = d3d9Tex.cast<Resource>();

An unsafe cast is the fastest (in release mode, the compiler should optimize the method call into nothing), but of course it also makes it extremely easy to shoot yourself in the foot. The 2 safe-cast methods call the Rtti::IsDerivedFrom() method, no temporary Ptr<> object will be created since they return a const-ref.

Query RTTI objects directly:

You can directly query many class properties without having an actual object of the class around:

// get the name of a class:
const String& className = MyClass::RTTI.GetName();

// get the FourCC identifier of aclass:
FourCC classFourCC = MyClass::RTTI.GetFourCC();

// get a pointer to the Rtti object of the parent class
// (returns 0 when called on RefCounted::RTTI)
Rtti* parentRtti = MyClass::RTTI.GetParent();

// check if a class is derived from this class:
// by Rtti object:
if (MyClass::RTTI.IsDerivedFrom(OtherClass::RTTI)) …
// by class name:
if (MyClass::RTTI.IsDerivedFrom("MyNamespace::OtherClass")) …
// by class fourcc:
if (MyClass::RTTI.IsDerivedFrom(FourCC('OTHR'))) …

You can check two Rtti objects for equality or inequality:

const Rtti& otherRtti = …;
if (MyClass::RTTI == otherRtti)…
if (MyClass::RTTI != otherRtti)…

Since it is guaranteed that only one Rtti object exists per class this is equivalent to comparing the addresses of 2 Rtti objects (and that’s in fact what the equality and inequality operators do internally).

Create objects directly through the RTTI object:

Ptr<MyClass> myObj = (MyClass*) MyClass::RTTI.Create();

The old-school C-style cast looks a bit out of place but is currently necessary because the Rtti::Create() method returns a raw pointer, not a smart-pointer.

Creating an object through the RTTI object instead of the static MyClass::Create() method is useful if you want to hand the type of an object as an argument to a method call like this:

Ptr<RefCounted> CreateObjectOfAnyClass(const Rtti& rtti)
{
    return rtti.Create();
}

This is a lot faster than the 2 other alternatives, creating the object through its class name or class fourcc identifier.
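One way to picture why creation through the RTTI object is cheap: the Rtti object can simply carry a function pointer to a per-class creator function, so no name or FourCC lookup is involved. The following is a sketch under that assumption, not the actual N3 implementation. The Creator typedef, the FactoryCreator function and the use of std::unique_ptr in place of Ptr<> are all made up for this example.

```cpp
#include <cassert>
#include <memory>

struct RefCounted {
    virtual ~RefCounted() = default;
};

// stand-in Rtti object which stores a pointer to a creator function
struct Rtti {
    using Creator = RefCounted* (*)();
    const char* name;
    Creator creator;

    // returns a raw pointer, like the Rtti::Create() described above
    RefCounted* Create() const {
        assert(creator != nullptr);
        return creator();
    }
};

struct MyClass : RefCounted {
    static RefCounted* FactoryCreator() { return new MyClass; }
    static const Rtti RTTI;
};

const Rtti MyClass::RTTI{"MyClass", &MyClass::FactoryCreator};

// the type of the object to create is handed over as an argument;
// creation is one indirect function call, no string or FourCC lookup
std::unique_ptr<RefCounted> CreateObjectOfAnyClass(const Rtti& rtti) {
    return std::unique_ptr<RefCounted>(rtti.Create());
}
```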

Create objects by class name or FourCC identifier

You can use the Core::Factory singleton to create RefCounted-derived objects by class name or by a FourCC identifier:

Ptr<MyClass> obj = (MyClass*) Factory::Instance()->Create("MyNamespace::MyClass");

Ptr<MyClass> obj = (MyClass*) Factory::Instance()->Create(FourCC('MYCL'));

This is mainly useful for serialization code, or if the type of an object must be communicated over a network connection.

Lookup the RTTI object of a class through the Core::Factory singleton

You can get a pointer to the static RTTI object of a class by class name or class FourCC identifier:

const Rtti* rtti = Factory::Instance()->GetClassRtti("MyNamespace::MyClass");

const Rtti* rtti = Factory::Instance()->GetClassRtti(FourCC('MYCL'));

This will fail hard if the class doesn’t exist. You can check whether a class has been registered with the factory using the ClassExists() methods:

bool classExists = Factory::Instance()->ClassExists("MyNamespace::MyClass");

bool classExists = Factory::Instance()->ClassExists(FourCC('MYCL'));
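Conceptually the factory is little more than two lookup tables which map class names and FourCC codes to Rtti pointers. The sketch below shows that idea with std::map. It is an illustration under assumptions, not the real Core::Factory: the Register() method, the MakeFourCC() helper and the reduced Rtti struct are hypothetical, and the "fail hard" behaviour is modelled with assert().

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// pack four characters into one 32-bit code (hypothetical helper)
constexpr std::uint32_t MakeFourCC(char a, char b, char c, char d) {
    return (std::uint32_t(static_cast<unsigned char>(a)) << 24) |
           (std::uint32_t(static_cast<unsigned char>(b)) << 16) |
           (std::uint32_t(static_cast<unsigned char>(c)) << 8) |
            std::uint32_t(static_cast<unsigned char>(d));
}

struct Rtti {
    std::string name;
    std::uint32_t fourcc;
};

class Factory {
public:
    // returns false if the class name or FourCC code is already taken
    bool Register(const Rtti* rtti) {
        if (ClassExists(rtti->name) || ClassExists(rtti->fourcc)) {
            return false;
        }
        byName[rtti->name] = rtti;
        byFourCC[rtti->fourcc] = rtti;
        return true;
    }

    bool ClassExists(const std::string& name) const {
        return byName.count(name) != 0;
    }
    bool ClassExists(std::uint32_t fourcc) const {
        return byFourCC.count(fourcc) != 0;
    }

    // both lookups fail hard (assert) if the class is unknown
    const Rtti* GetClassRtti(const std::string& name) const {
        assert(ClassExists(name));
        return byName.at(name);
    }
    const Rtti* GetClassRtti(std::uint32_t fourcc) const {
        assert(ClassExists(fourcc));
        return byFourCC.at(fourcc);
    }

private:
    std::map<std::string, const Rtti*> byName;
    std::map<std::uint32_t, const Rtti*> byFourCC;
};
```

A lookup like this costs a string or integer search per call, which is why going through the Rtti object directly is the faster path when the type is known at compile time.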

Troubleshooting

There are 2 common problems with Nebula3’s RTTI system.

When writing a new class, it may happen that the FourCC code of the class is already taken. In this case, an error dialog will pop up at application start which looks like this:

To fix this collision, change the FourCC code of one of the affected classes and recompile.

The other problem is that a class doesn’t register at application startup because the constructor of its static RTTI object has been “optimized away” by the linker. This happens when there’s no actual C++ code in the application which directly uses this class. This is the case if an object is created through N3’s create-by-class-name or create-by-class-fourcc mechanism and the class is only accessed indirectly through virtual method calls.

In this case the linker will drop the .obj module of this class completely since there are no calls from the outside into the object module. That’s a neat optimization to keep the executable size small, and it works great with the static object model of plain C++, but with Nebula3’s dynamic object model we need to trick the linker into linking “unused” classes into the executable. Fortunately we don’t have to do this for every RefCounted-derived class, only for specific parts of the inheritance hierarchy (for instance subclasses of ModelNode and ModelNodeInstance in the Render layer, or subclasses of Property in the Application layer).

To prevent the linker from dropping a class the following procedure is recommended:

add a __RegisterClass(MyClass) macro to a central .h ‘class registry’ header file
include this header file into a .cc file which definitely won’t be dropped by the linker

The header file /nebula3/code/render/render_classregistry.h is an example of such a centralized class registry header.
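The general idea behind such a registry header can be sketched as follows, assuming the registration macro boils down to a static variable whose initializer calls into the class. That call is a reference from the outside into the class's object module, so the linker has to keep it. The macro name SKETCH_REGISTER_CLASS, the RegisterWithFactory() function and the RegisteredClasses() helper are made up for this sketch, which is not what __RegisterClass actually expands to.

```cpp
#include <set>
#include <string>

// central list of registered class names; the function-local static
// avoids problems with the initialization order of static objects
inline std::set<std::string>& RegisteredClasses() {
    static std::set<std::string> classes;
    return classes;
}

// in a real project this class would live in its own .cc file,
// which is the module the linker might otherwise drop
struct MyClass {
    static bool RegisterWithFactory() {
        return RegisteredClasses().insert("MyNamespace::MyClass").second;
    }
};

// hypothetical registry macro: the static variable forces a call into
// the class's module from whichever .cc file includes the registry
#define SKETCH_REGISTER_CLASS(type) \
    static const bool type##_registered = type::RegisterWithFactory();

// the 'class registry' header would contain one such line per class
SKETCH_REGISTER_CLASS(MyClass)
```

Because the registry header is included into a .cc file that is definitely linked, the registration call runs during static initialization, before any create-by-name request can come in.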

6 Jun 2009

Tidbits

I must admit that I’m completely hyped for the Xbox team’s new “interaction system” Project Natal, especially since this guy has been part of the team (Microsoft sure knows how to grab the relevant people). One has to look past the silly demos though (using a car racing game to demonstrate “controller free gaming” is kind of stupid when in the real world you’re sitting *inside* a huge freaking controller, which has a much more complex interface than any input device used for gaming and requires extremely skillful use of both hands and feet…).

It’s fascinating to think about how much more information a game suddenly has about the world on the other side of the wall. With mouse/keyboard, a game pad, or even a Wiimote, the button presses, pointer positions, analog sliders and acceleration vectors only provide a handful of values, a few hundred bits of information maybe. That’s like a few tiny lights in a vast dark ocean. But if the currently available information is somewhat correct, Natal provides per-pixel color and depth-information as well as a microphone array for spotting sound sources, everything pre-processed in meaningful ways by the software (that’s the important part which differentiates it from something simple like a web cam). That’s like a sunrise in the dark ocean! The game is suddenly able to see and hear what’s happening in that strange parallel universe called “real world”, and the software (which is hopefully provided with the 360 SDK) is the brain which interprets that information. That’s another great thing, all the interesting stuff happens in the software and can be improved with a simple software update, instead of trying to sell another hardware addon like the Motion Plus to baffled players.

Now the only question is how to avoid walking into the Wii trap and flooding the 360 with stupid mini-games (IMHO mini-games are to game designers what crates are to level designers: their last resort if they run out of ideas). It would be a terrible shame if this great technology were only used for Wii Play clones and silly “family games”. If Natal is rejected by the 360’s native hardcore crowd (yes I said the dirty word, there is no such thing as a “casual Xbox player”), it will soon fade into oblivion just like the current Vision Camera. I really doubt that Natal will cause casuals to forget about their Wii and convert to the 360 en masse. That ship sailed long ago. Maybe it could happen with the next console generation, or if Microsoft does a really big re-launch of a theoretical “Natal 360” which needs to look a lot slicker and sexier than the current 360. Anyway, for the games audience, I think Microsoft should first try to sell Natal to the hardcore, and that’s not done with some silly ball catching or tennis demos, but with full-frontal science fiction shit like this: http://www.youtube.com/watch?v=UUuRqzVhJmA (although this is a conventional multi-touch device, not Natal, but you get the idea). For marketing to the casuals, concentrate on the 360 as a media device (now that was the right step in one of the demo videos, demonstrate how to watch a movie without having to mess around with remote controls – because the one thing that’s much more complicated than a game pad is a remote control with dozens of tiny buttons).

I think the best way to design a Natal game is to pretend that gamepads, keyboard and pointing devices never existed, go back to the 60’s or even the 50’s and create some sort of alternate future without keyboards, mice and gamepads. This is hard and will take a lot of time, since our current gaming genres are the result of decades of joint evolution of game play and input methods. Simply putting a new input scheme over existing games won’t work very well (I think the Wii has shown this already). But on the other hand, the first thing that you tell a PC game designer or programmer who is starting to work on a console game is to forget that mouse/keyboard ever existed and to fully embrace the platform’s controller as if it were the only input device in the whole universe. Switching to this new state of mind can be incredibly hard and some will never be able to do it, but with an open mind this often works surprisingly well.

Some unrelated stuff:

Man, I hope 70 Euros isn’t becoming the new standard price for Xbox360 games in Germany (that’s about 100 US-$!). 60 Euros was already a shocker when the 360 launched. 70 Euros is waaayyy beyond impulse-buy-territory for me, especially when there’s only a few hours of entertainment in a game, which is becoming more and more common. I’m currently extremely selective when buying new games, I think I only bought one or two new games this year, instead of almost one game per month in the past. I can’t imagine that this pricing strategy is good for the 360-platform in Germany, since people will only buy the very few big blockbusters and skip everything else, or get the cheaper (and most often better) UK versions.
Very cool news: Bethesda will offer the Fallout 3 DLC on discs! I have skipped all but the first addons for Fallout 3 because they’re only available in the German language version. With the disc release I can finally order the proper English version and don’t have to live with the poor German localization.
My wish from last year has been granted! Full game downloads on Xbox Live starting August 2009! Maybe I should consider a career as a fortune-teller or analyst hehe…
We got a couple of those slick new PS3 devkits (the new ones in standard PS3 cases, not those older 19” rack monsters) and have started the Nebula3 PS3 port already. We’re putting a proper team of 4 behind this so the circle will soon be complete :) I’m quite happy with the progress so far. The SDK is very complete, the tools are nice enough (pretty good VStudio integration, thank God). Not quite as cool as the 360 stuff, but that’s more a question of personal taste and probably not objective. The PS3 SDK seems to have made great strides since the PS3 launch, and I think much of the negativity on the internet surrounding PS3 development stems from those early days (which we haven’t experienced ourselves).

That’s it for now, I’m going to nose around the PS3 SDK docs a bit now, perfect activity for a Saturday afternoon :)

-Floh.