The Brain Dump: 2009

10 Nov 2009

Drakensang River Of Time - Personal Edition

Check this out: www.amazon.de/amflussderzeit

Amazon has started taking pre-orders for the super-limited German *personalized edition* of “Drakensang - River Of Time” with your own name on the box. This is the first time any game has offered this, at least as far as we’re aware. I really, really dig the DVD box cover by the way. We got some of those posters a few days ago and they’re simply admirable :o)

5 Nov 2009

Nebula3 SDK Nov 2009 Changelog

Here’s the new N3 SDK: download link

As always, this only includes the sources for the Win32 platform. Console platform specific source code (Xbox360, PS3 and Wii) is not included for obvious legal reasons.

Here’s a rough change log since the Apr2009 SDK:

== Major New Features

unified XNAMath support on Win32 and Xbox360 platforms
PS3 support (not part of public SDK, but lots of fixes for GCC 4.x in platform-agnostic code)
HTTP filesystem wrapper now working properly, this makes it possible to create standalone N3 apps which load all their data from an HTTP server (see testhttpviewer.exe for an example)
“binary XML” support for much faster loading of big XML files (circumvents TinyXML)
new “FrameSync” system for running main and render thread in lock-step
new “Jobs” system to implement parallel jobs (CPU-thread-pool on Win32 and Xbox360, SPUs on PS3)
window parenting, it’s now possible to open the render window as a child of another window, this makes it possible to embed N3 into another Windows application
FMOD integration

== Foundation Layer

Core

new Debug::StringAtomPageHandler to inspect string atom table from web browser
type casting methods optimized in Ptr<>
optional allocation from memory pool support for RefCounted objects (currently unstable!)

Util

Win32StringConverter: helper class to convert between UTF-8 and wide character strings (currently only on Win32 platform)
Util::Array::InsertSorted() now returns index of inserted element
new Util::BitField<> class to allow bit mask operations on masks wider than 32 bits
removed classes of old string atom system: Util::Atom<>, Util::Proxy<>
new method Util::FixedArray<>::Resize()
classes for new string atom system: Util::StringAtom, Util::StringBuffer, Util::LocalStringAtomTable, Util::GlobalStringAtomTable
new method Util::Queue<>::Reserve()
new direct access methods in Util::RingBuffer<>
new method Util::Round::RoundUp()
new class Util::SparseTable, for 2D data tables with a lot of empty cells
Util::String:
- new method CopyToBuffer()
- new optimized version of Tokenize() which fills a provided string array with the tokens, this makes it possible to reuse an existing array object
- new static wrapper methods: IsDigit(), IsAlpha(), IsAlNum(), IsLower(), IsUpper(), StrCmp(), StrLen(), StrChr()
new util functions to help with “type punning”
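
To illustrate the idea behind a bit field wider than 32 bits (see the Util::BitField<> entry above), here is a minimal sketch. It is not the actual N3 source: the class name WideBitField and its method names are made up for this example, and the real class may look quite different. The mask is simply stored as an array of 32-bit words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only, NOT the N3 Util::BitField<> implementation.
// Stores NumBits bits in an array of 32-bit words.
template <std::size_t NumBits>
class WideBitField {
public:
    WideBitField() {
        for (std::size_t i = 0; i < NumWords; ++i) words[i] = 0;
    }
    void SetBit(std::size_t bit) {
        assert(bit < NumBits);
        words[bit / 32] |= (std::uint32_t(1) << (bit % 32));
    }
    void ClearBit(std::size_t bit) {
        assert(bit < NumBits);
        words[bit / 32] &= ~(std::uint32_t(1) << (bit % 32));
    }
    bool IsSet(std::size_t bit) const {
        assert(bit < NumBits);
        return (words[bit / 32] & (std::uint32_t(1) << (bit % 32))) != 0;
    }
    // true if this mask and 'other' have at least one set bit in common
    bool Intersects(const WideBitField& other) const {
        for (std::size_t i = 0; i < NumWords; ++i) {
            if (words[i] & other.words[i]) return true;
        }
        return false;
    }
private:
    static const std::size_t NumWords = (NumBits + 31) / 32;
    std::uint32_t words[NumWords];
};
```

A WideBitField<96> would, for instance, allow testing two 96-bit masks against each other with a single Intersects() call.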

IO

BXmlReader: stream reader for “binary XML files” (created by the new binaryxmlconverter3.exe utility)
some low-level-optimizations in ZIP filesystem wrapper
application root directory stuff moved from AssetRegistry into Core::CoreServer
new class GameContentServer, used to properly setup game data on some console platforms
added support for http: and httpnz: schemes for reading data from HTTP servers through the N3 filesystem wrapper

Math

Xbox360 and Win32 math classes have been unified into XNAMath classes
low-level performance tweaking

Memory

experimental memory pool support on Win32 platform
on Win32 platform, dynamically allocated memory is now 16-byte aligned (NOTE: there seems to be a hard to reproduce critical bug in Realloc() where HeapSize() returns a wrong value)
new HTML debug output in Debug::MemoryPageHandler for memory pools

Threading

Threading::CriticalSection rewritten with "Fast critical sections with timeout" by Vladislav Gelfer (on Win32 platform)
Threading::Event now supports “manual reset” behaviour
Threading::Interlocked class now uses compiler intrinsics on Win32 and Xbox360 platform
new class Threading::ThreadBarrier: stops a thread until all other threads have arrived at the barrier
optimizations in Threading::SafeQueue

System

new class System::Win32Environment to access environment variables (Win32 platform only)
Win32Registry class now reads registry key values as wide char and converts them to UTF-8
type punning fixes in System::ByteOrder

Timing

removed MasterTime/SlaveTime system, global timing is now provided by the FrameSync subsystem

Messaging

the Message::SetHandled() method was not thread-safe, now uses Interlocked::Exchange() to update its status
Messaging::AsyncPort rewritten to allow better control over message handling behaviour through subclasses of HandlerThreadBase
new async message handler thread classes BlockingHandlerThread, RunThroughHandlerThread
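
The SetHandled() fix above boils down to updating a flag with an atomic exchange instead of a plain write. The following is only a rough sketch of that idea: it uses std::atomic as a portable stand-in for Interlocked::Exchange(), and the class name SketchMessage is hypothetical, not the real Messaging::Message class.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a message object which is shared between threads.
// std::atomic replaces the platform-specific interlocked call here.
class SketchMessage {
public:
    SketchMessage() : handled(0) {}

    // atomically set the handled flag, returns the previous state
    bool SetHandled(bool b) {
        return handled.exchange(b ? 1 : 0) != 0;
    }
    // atomically read the handled flag
    bool Handled() const {
        return handled.load() != 0;
    }
private:
    std::atomic<int> handled;
};
```

Returning the previous state is a design choice of this sketch, it lets the caller detect whether somebody else has already marked the message as handled.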

Net

some type punning fixes in debugpacket.cc

Http

new classes HttpClientRegistry, HttpStream, HttpNzStream to implement a transparent HTTP filesystem, the HttpNzStream uses client-side zlib decompression to improve download performance

Debug

no noteworthy changes

App

new application identifier strings AppTitle and AppID, this is necessary for some console platforms

Jobs

this is a new subsystem to distribute tasks either across threads in a thread-pool (Win32 and Xbox360) or the SPUs on the PS3

FrameSync

this is a new subsystem which implements better synchronization between the game thread and render thread

== Render Layer

CoreGraphics

parent window stuff in DisplayDevice (Win32 platform)
it’s now possible to share depth/stencil buffers between render targets
the ShaderServer now parses a dictionary file (created by the shaderbatcher3.exe tool) instead of listing the directory content of the shaders directory
removed array support from shader variables (shader parameter arrays are not very portable)
some restructuring because of the PS3 port (some classes have been split into a base class and platform specific derived classes)
new private method in D3D9RenderDevice: SyncGPU(), which is called inside Present() to prevent the GPU from running too far ahead of the CPU (this is a driver-internal “optimization” which can lead to frame stuttering under some circumstances)
better control over clearing a render target through clear flags
the RenderDevice::SaveScreenshot() method is now responsible for setting the MIME type on the output stream, this is because the actually saved MIME type may now be different than the requested type
no more byte-order conversion when loading mesh files, this happens in the asset pipeline now
new class MemoryMeshLoader, setup a mesh object from an existing VertexBuffer and IndexBuffer object

CoreAudio and Audio

the CoreAudio and Audio subsystems are obsolete and have been replaced with the FMOD-based Audio2 subsystem, which “automatically” works across all platforms, please check the FMOD license restrictions for commercial projects!

CoreAnimation

the following classes have been removed from CoreAnimation: AnimDrivenMotionSampler, AnimMixer, AnimSampler
new file format for animation data: nax3
new animation curve type: Velocity, this is used by the AnimDrivenMotion feature

Input

no noteworthy changes

Frame

minor changes for Pre-Lightpass-Rendering
better control over render target clear in FramePass
FramePostEffect: rendering a full-screen-quad has been moved into new helper class RenderUtil::DrawFullScreenQuad
frame shaders are now loaded on demand
new LightServer class: LightPrePassServer implements light pre-pass rendering (a variation on deferred shading, currently only implemented in the PS3 port)

Animation

anim evaluation has been “jobified”
no more AnimDrivenMotion specific code in Animation subsystem (this is now handled through a new anim curve type which contains velocity keys)

Audio2

new FMOD-based multiplatform audio subsystem

Characters

skeleton evaluation has been “jobified”
on PS3, skinning is now running on SPUs
the entire character subsystem has been optimized (essentially rewritten)

InternalGraphics

uses the FrameSync subsystem to run render thread and game thread in lock-step (this basically fixes all stuttering problems)
more debug infos displayed in web browser through GraphicsPageHandler
lots of fixes to the attachment system (character joint attachments: swords, shields, etc…)
restructured the Update/Render-Loop for better parallelization support, the idea is basically to make more room between updating an object and rendering an object so that asynchronous jobs have a better chance to finish on time before rendering requires the jobs’ output data

Graphics

some new messages to communicate from the main thread to the render thread (see graphicshandler.cc)

Models

nothing noteworthy…

Particles

some restructuring for better portability
particle updates have been “jobified”

RenderModules

no noteworthy changes…

RenderUtil

new helper class RenderFullScreenQuad
new helper class NodeLookupUtil to lookup a ModelNodeInstance in a hierarchy

Resources

nothing noteworthy

== Moved into Addons:

fx
network
locale
posteffect
ui
vegetation
vibration
video

== New Stuff in ExtLibs:

FMOD
RakNet

Enjoy!

23 Oct 2009

Drakensang River Of Time Intro

Here’s the intro video of our new Drakensang game (River Of Time):

(Go here for the bigger version)

The cutscene was created in-engine with our new cutscene editor tool described in one of my earlier blog posts, and then captured frame by frame and encoded with Bink at 1280x720 (the YouTube version unfortunately looks quite a bit darker than the original if I’m not mistaken). The characters have higher resolution meshes and textures created especially for the intro video, but the underlying joint skeleton and facial animation system are identical to those of the ingame-characters. The decision to encode the cutscene into a video stream instead of running it in real-time was made early in the project to remove a few risks. We couldn’t be sure what the performance would look like with the high-res assets and all the dynamic lighting, and whether or not a lot of post-processing would be necessary after capturing the raw frames. Turns out that the real-time cutscene looks so good that no “cheating” was necessary, so in future projects we will probably do everything in real-time from the beginning.

The advantage of building and tweaking the cutscene with an instant real-time preview can’t be stressed enough. The intro to the original Drakensang was done the traditional way: short scenes were built in Maya, rendered overnight, and then arranged and cut in some video editing tool. The massive turn-around time between tweaking something and seeing the result was a huge problem and in the end we ran out of time. Creating the new intro video was completely painless and straight-forward. The artists actually had fun creating it (at least that’s the impression I got watching them from time to time hehe), and I think that’s clearly visible in the result :)

10 Oct 2009

Ninja Gaiden Sigma 2

I played halfway through the campaign of NGS2 yesterday evening, with mixed emotions. It’s very obvious that the director of this game has a very different vision of Ninja Gaiden than Itagaki. Sometimes for the better, but most of the time I would not call the changes actual improvements. It’s surprising how many gameplay elements which worked well have been removed. I hope that some of the shortcomings will be fixed in the higher difficulty levels (I’m currently playing on Warrior difficulty).

The Good:

Graphics have improved dramatically! The game generally looks a lot crisper (I guess that NG2 was upscaled, and NGS2 is native 720p), textures seem to be higher resolution, normal mapping and specular highlight effects have been tuned, it’s really a difference like night and day (in some cases literally, when returning to the Ninja village in chapter 2 it is now broad daylight, not night time). This is what the original NG2 should have looked like.
New minions: There are a couple of new enemies in the game, some variations of NG2 minions, and some variations of original Ninja Gaiden monsters.
New and tweaked bosses: There are a couple of new boss fights in the game which are variations of the final boss in the original NG2. The Genshi fights are more interesting (you can now do an Izuna Drop on Genshi for instance).
Excessive rocket spam removed: that’s about the only good gameplay change, almost all of the “unfair” rocket spam has been removed.
Mission Mode now included in the game, plus Coop: Since the original NG2 wasn’t released in Germany, there’s also no downloadable content on the Xbox Live marketplace (damn you Microsoft). NGS2 includes a mission mode, and adds 2-player-coop to those missions.

The Bad:

Far fewer enemies on screen: this is the most painful change. The oh-shit-moments of NG2 when 15 blood-thirsty spider-clan-ninjas were rushing down a hallway, running into Ryu who is starting an Ultimate Technique, turning the whole screen into a mess of flying body parts, and then cleaning up the survivors with a series of Obliteration moves. That’s no longer happening in NGS2. Typically, there are no more than 3 or 4 enemies on screen. Generally, combat encounters are much shorter and easier than in NG2.
Slowdown, tearing and in-level loading still there: The dreaded slowdown from NG2 still happens, it’s not happening as often as in NG2, but only because the number of enemies on screen and rocket spam has been reduced. When the slowdown happens it even kicks in sooner, with less on-screen action, than in the original NG2. This is a big let-down. There are places with screen-tearing even when there are no enemies on screen, and the game still pauses mid-game to load data. It’s not game-breaking but disappointing considering that the team had over a year to tweak and optimize.
Empty hallways: A lot of combat encounters have been removed from the game, locations which were packed with enemies in the original are desolate in NGS2. I really hope this is just a difficulty level thing, and that there are more encounters in the higher difficulty levels.
Fewer choices: some of the design decisions are downright stupid:

Weapon upgrades are now limited to a few shops, and the first time any weapon can be upgraded to level 3 is very late in the game (at the start of the Moscow chapter). Weapon upgrades don’t cost any money now however, which sounds good at first, but money and shops in general are quite useless now (at least in Warrior difficulty).
Life Of The Thousand Gods is now immediately activated when picked up, and no longer refills the life bar. This removes a very nice tactical element from the game (should I use the immediate benefit of having a longer life bar, or should I use it as an additional health potion during the next boss fight?).
Same with the Life Of The Gods items, it’s no longer possible to manually activate them when needed, instead they auto-activate when picked up.

The Rest:

I actually like the new blood-effect-replacement and toned down violence, it makes the game more arcady and more enjoyable (IMHO).
One really starts to appreciate how good the 360’s controller is after 5 or 6 hours of playing with the PS3 controller. My left hand literally hurt after the session.
The additional campaign chapters with the new playable characters are disappointing. It’s too little to feel comfortable with the new characters and their moves, they’re limited to a single weapon, and the levels are mostly reused from the original game.
They actually tried to fix the story LOL. There’s a “prelude comic” during installation and a few fixes to the cut-scenes during the game to make the story more comprehensible and give it more of a background… as if anybody gives a shit about the story in a Ninja Gaiden game. The result is a complete mess. I can imagine a game featuring ninjas where story actually plays an important part, but not in the Ninja Gaiden universe. It’s really too late for that hehe. The only thing that’s really missing in Ninja Gaiden are pirates, oh … and zombies of course.

If NGS2 is an indication of what the future of Ninja Gaiden looks like without Itagaki then I’m out. The game looks shiny, but the changes to the core game-play are all pointing in the wrong direction for my taste. I was hoping that NGS2 would become what Ninja Gaiden Black was to the original, a real improvement to an already great game, and if there’s any game which really needs a good polish, it is the original NG2. But NGS2 adds only very few improvements, and abandons too many good ideas from its predecessor. It’s just different, not better than NG2.

6 Oct 2009

Tools of the Trade

We have ramped up tools development at Radon Labs considerably during the development of the two Drakensang games. Traditionally we have been (and still are) a bit conservative about inhouse tool development and try to avoid re-inventing wheels as much as possible. Each new custom-tool requires permanent maintenance work and if a “standard industry tool” exists for a job it is usually better to just use this. But especially in the domain of “game-logic stuff” there are basically no standard tools, so the situation is much more dire compared to graphics or audio tools.

We have traditionally been using Excel tables and XML files compiled into an SQLite database for “game data”. This works pretty well if the number of “work items” is around a few hundred. But for complex RPG games, the number of work items is in the range of tens-of-thousands (quest tasks, dialog takes, voice over snippets, graphics objects, textures, items, NPCs, monsters, behaviour scripts, and so on…). Excel tables and raw XML files don’t scale up very well because they lack game-specific features for filtering, searching, statistics, and of course the possibility of human error is very high, and finding and fixing those errors isn’t much fun either.

The Texture Tool

The very first custom tool in C# (to test the waters so to say) was a simple replacement for an Excel table which defined texture attributes (stuff like the size and DXT format of a texture, mip-map quality and so on). The Excel table made it possible to give each individual texture its own set of attributes. For a small project with a few hundred textures this works pretty well. Drakensang has around 15,000 textures however, and this is way beyond the territory where Excel is starting to become a pain in the ass. Thus we wrote this:

The tool manages the same information as the Excel table it replaces, but has much better filtering and manipulation features. The left-hand side shows a tree-view of all texture categories, and it’s possible to either display all textures, or textures of a given category, and it’s also possible to further filter the textures by a pattern (for instance to display all normal textures of the “armor” category, click on “armor” in the tree view and type “*_bump.*” into the top-most line of the “File” column). Entering a value into one of the other top-most column-entries will set this attribute value on all displayed textures. It’s a very simple tool, but it’s easy to use and scales up very well to tens-of-thousands of textures while still making it possible to find and tweak a single attribute of a specific texture with ease.
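
To make the pattern filter a bit more concrete, here is a small sketch of how such a wildcard filter could work. The actual tool is written in C# and I'm not showing its code here: the function names and the texture file names below are made up, and the sketch only assumes that “*” matches any run of characters and “?” matches exactly one character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if 'str' matches 'pattern'. '*' matches any run of
// characters (including none), '?' matches exactly one character.
bool MatchPattern(const char* pattern, const char* str) {
    if (*pattern == '\0') return *str == '\0';
    if (*pattern == '*') {
        // try to match the rest of the pattern at every remaining position
        for (const char* s = str; ; ++s) {
            if (MatchPattern(pattern + 1, s)) return true;
            if (*s == '\0') return false;
        }
    }
    if (*str != '\0' && (*pattern == '?' || *pattern == *str)) {
        return MatchPattern(pattern + 1, str + 1);
    }
    return false;
}

// Keeps only the file names which match the pattern (hypothetical helper).
std::vector<std::string> FilterByPattern(const std::vector<std::string>& files,
                                         const std::string& pattern) {
    std::vector<std::string> result;
    for (std::size_t i = 0; i < files.size(); ++i) {
        if (MatchPattern(pattern.c_str(), files[i].c_str())) {
            result.push_back(files[i]);
        }
    }
    return result;
}
```

Filtering a list of file names with “*_bump.*” would then keep something like plate_armor_bump.dds and drop plate_armor_color.dds.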

The Story Editor

When we started planning Drakensang, we knew that we would need new tools for creating quests, dialogs and game logic scripting. On the surface, these 3 things are completely different, but for the game core, quests, dialogs and scripts aren’t that different (the common element is the use of Conditions (small C++ objects which check whether some condition in the game world is true) and Actions (similar C++ objects which manipulate the game world)). Thus the Story Editor was born (actually, it’s a Dialog/Quest/Script editor). The Story Editor loads and saves XML files, which are compiled into the game database during the build process, or directly from the tool to immediately check the results in the game. We couldn’t anticipate all required features for the story editor at the start of the project, thus the editor was constantly worked on during the development of Drakensang (we added a lot of features to improve localization, proof-reading or voice-over integration for instance).

Note that the Story Editor currently suffers a bit from the typical “inhouse tool UI aesthetics problem” ;)

Here’s a screenshot of the Story Editor when working on a simple quest:

On the left is the simple linear task list, the panels on the right show the attributes of the currently selected task (for instance the conditions and actions associated with the task).

Here’s the screenshot of the Story Editor with a simple action script:

We opted against a traditional scripting language, but instead used a “point and click” approach. A script is just a simple collection of Conditions and Actions. These could just as well be wrapped into Lua functions for instance. Our current approach is probably not as powerful as a real scripting system, but definitely less error-prone and easier to control.
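
For readers who wonder what “a simple collection of Conditions and Actions” might look like in code, here is a heavily simplified sketch. None of this is taken from the Drakensang sources: the World type, the class names ValueAtLeast and SetValue, and the rule that all conditions must hold before any action runs are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified game world: named integer values.
typedef std::map<std::string, int> World;

// A Condition checks whether something in the game world is true.
class Condition {
public:
    virtual ~Condition() {}
    virtual bool Check(const World& world) const = 0;
};

// An Action manipulates the game world.
class Action {
public:
    virtual ~Action() {}
    virtual void Execute(World& world) const = 0;
};

// Example condition: is a world value at least 'minValue'?
class ValueAtLeast : public Condition {
public:
    ValueAtLeast(const std::string& k, int v) : key(k), minValue(v) {}
    virtual bool Check(const World& world) const {
        World::const_iterator it = world.find(key);
        return it != world.end() && it->second >= minValue;
    }
private:
    std::string key;
    int minValue;
};

// Example action: set a world value.
class SetValue : public Action {
public:
    SetValue(const std::string& k, int v) : key(k), value(v) {}
    virtual void Execute(World& world) const { world[key] = value; }
private:
    std::string key;
    int value;
};

// A script: executes all actions, but only if all conditions hold.
struct Script {
    std::vector<const Condition*> conditions;
    std::vector<const Action*> actions;

    bool Run(World& world) const {
        for (std::size_t i = 0; i < conditions.size(); ++i) {
            if (!conditions[i]->Check(world)) return false;
        }
        for (std::size_t i = 0; i < actions.size(); ++i) {
            actions[i]->Execute(world);
        }
        return true;
    }
};
```

Since a script is pure data (a list of objects with parameters), it can be built by pointing and clicking in an editor and stored in an XML file, without anybody ever writing script code.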

Finally here’s the Story Editor when working on a dialog:

Dialog takes can be associated with Conditions (to show or hide dialog takes based on some in-game condition), or Actions (to manipulate the game world as the result of a conversation). The Comment tab on the right side is most useful to add instructions for audio recording sessions.

The Sequence Editor

In the first Drakensang, in-game cutscenes were hand-scripted by defining a sequence of Actions in the Story Editor, basically in the most un-intuitive way imaginable. For the new Drakensang, we wrote a whole new “Sequence subsystem” along with a new tool (the Sequence Editor) which gives control over cutscenes back into the hands of the artists and provides a very intuitive workflow with instant turnaround times (resulting in a dramatic quality improvement if I may say so). Here’s a screenshot of the Sequence Editor in action on a dual-monitor setup:

On the left-hand side is the actual editor, on the right-hand side is the ingame-preview. Tool and game communicate through XML messages over a TCP/IP connection, changes in the tool immediately show up in the preview. The other direction works as well, for instance it is possible to set keys for the camera or objects directly in the preview window, complete with snap-to-ground and other time-savers.

Here’s a more detailed shot of the Sequence Editor:

“Trackbar Elements” from the bottom area (Play Sound, Depth Of Field, etc…) can be drag’n’dropped into the timeline area above. Every trackbar element comes with a number of attributes, most of which can be animated over time. Trackbar attributes are displayed in the right-side column (in the screenshot, I have selected an “Ambience Bubble” element which makes it possible to manipulate the ingame lighting, posteffect parameters, and other stuff which is important to set the visual “mood” of a scene). Animated parameters are edited in the Graph Editor which looks and feels very similar to Maya’s graph editor window:

It’s important to note that complex character animations are not created in the Sequence Editor, instead the sequence system will usually only trigger existing character animations from the character’s animation library. How complex character animation tracks are is up to the cutscene designer. Sometimes it’s better to fire a lot of small, generic animations and do the path-animation in the Sequence Editor, sometimes it’s better to generate one big motion-capture animation for the entire cutscene, complete with the actual character movement.

One very cool feature of the sequence system is its extensibility. Simple trackbar elements, like “Depth Of Field” or “Camera Shake” can be implemented in under an hour by extending an XML config file which describes the attributes of the new element to the editor tool, and by writing a very simple subclass in the engine which connects the animated values of the trackbar to another subsystem (for instance the PostEffects subsystem in case of the Depth Of Field trackbar).
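
The engine-side half of such an extension could look roughly like the sketch below. This is an illustration under assumptions, not the real Sequence subsystem: the class names TrackbarElement and DepthOfFieldElement, the PostEffectParams struct and the linear key interpolation are all made up for the example (the real system most likely supports more curve types than linear).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One animation key of an animated trackbar attribute.
struct Key {
    float time;
    float value;
};

// Linear interpolation between keys (keys must be sorted by strictly
// increasing time), clamps outside of the keyed range.
float SampleKeys(const std::vector<Key>& keys, float time) {
    assert(!keys.empty());
    if (time <= keys.front().time) return keys.front().value;
    if (time >= keys.back().time) return keys.back().value;
    for (std::size_t i = 1; i < keys.size(); ++i) {
        if (time < keys[i].time) {
            const Key& a = keys[i - 1];
            const Key& b = keys[i];
            float t = (time - a.time) / (b.time - a.time);
            return a.value + t * (b.value - a.value);
        }
    }
    return keys.back().value;
}

// Hypothetical stand-in for the parameters of a post-effect subsystem.
struct PostEffectParams {
    float focusDistance;
    PostEffectParams() : focusDistance(0.0f) {}
};

// Hypothetical base class for all trackbar elements.
class TrackbarElement {
public:
    virtual ~TrackbarElement() {}
    virtual void Evaluate(float time) = 0;
};

// Connects one animated value to another subsystem.
class DepthOfFieldElement : public TrackbarElement {
public:
    DepthOfFieldElement(PostEffectParams* target, const std::vector<Key>& keys)
        : params(target), focusKeys(keys) {}
    virtual void Evaluate(float time) {
        params->focusDistance = SampleKeys(focusKeys, time);
    }
private:
    PostEffectParams* params;
    std::vector<Key> focusKeys;
};
```

The subclass itself is only a handful of lines, which fits the “under an hour” estimate for simple elements.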

The Table Editor

The new Table Editor is aimed at finally replacing the Excel tables for game object template data. In Drakensang (or Mangalore in general) a game object’s persistent state is simply described by a collection of typed key/value-pair attributes. Different game object categories have different sets of attributes. For instance a Monster game object has a different attribute set than a Weapon game object.
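
As a rough illustration of “a collection of typed key/value-pair attributes”, here is a minimal sketch. It is not the Mangalore attribute system: the Attribute class, its three value types and the attribute names used in the usage note are assumptions for this example only.

```cpp
#include <cassert>
#include <map>
#include <string>

// A single typed value (hypothetical, a real attribute system
// would support more types than these three).
class Attribute {
public:
    enum Type { Int, Float, String };

    Attribute() : type(Int), intValue(0), floatValue(0.0f) {}

    static Attribute FromInt(int v) {
        Attribute a; a.type = Int; a.intValue = v; return a;
    }
    static Attribute FromFloat(float v) {
        Attribute a; a.type = Float; a.floatValue = v; return a;
    }
    static Attribute FromString(const std::string& v) {
        Attribute a; a.type = String; a.stringValue = v; return a;
    }

    Type GetType() const { return type; }
    // the getters assert that the stored type matches the requested type
    int GetInt() const { assert(type == Int); return intValue; }
    float GetFloat() const { assert(type == Float); return floatValue; }
    const std::string& GetString() const { assert(type == String); return stringValue; }

private:
    Type type;
    int intValue;
    float floatValue;
    std::string stringValue;
};

// The persistent state of one game object: attribute name -> typed value.
typedef std::map<std::string, Attribute> AttributeSet;
```

A Weapon could then carry attributes like “Name” and “Damage”, while a Monster carries “HitPoints” and “Speed”, both stored in the same generic container.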

Traditionally, we had one Excel table per game object category, and one line in the table per game-object template. The value of some of the attributes (instance attributes) can be set in the Maya level editor to a different value for each game object instance.

Some of the bigger game object tables became a really critical bottleneck during the production of Drakensang, since only one person could work at a table at a time. Also even though the tables were saved in XML format, both CVS and SVN are pretty much useless for merging.

Thus the idea was born to move the game data tables into a true database backend, and then create one generic “Table Editor” and later on, several smaller, specialized Editor tools which basically provide a different view into the database. This is also the very first step of moving away from Maya as a level editor towards a truly collaborative level design environment.

Here’s a screenshot of the generic Table Editor tool:

On first glance, this just looks like a nifty Excel replacement, but under the hood it is much more:

There’s a complete user system with login and access rights (in the screenshot above, I can see that I am currently logged in as “floh”, and that 1 other user is currently editing the Weapon-Table).

Then there’s a complete revision control system. The table basically extends into the time dimension, and it’s possible to inspect the table at any point in the past. There’s also a very detailed change-log, which not only tells me who modified the table at what time, but also all the changes made to the table:

Of course it’s also possible to revert the content of any cell to any point in time.

When a level designer changes values in a table, it will only happen in a local sandbox (the changed values already live in the central database, but are marked as local data of the user). This allows a level designer to mess around with the tables without affecting the other users. When everything works as expected, the changes are committed into the “global domain” and the new values are visible to all.
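
The combination of revision history and local sandbox can be sketched for a single table cell as follows. This is only a toy model to illustrate the concept, the real system lives in a database backend: the Cell class, its method names, the integer timestamps and the user name “anna” in the usage note are all made up.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One entry in the change-log of a cell.
struct Revision {
    int time;
    std::string user;
    std::string value;
};

// Toy model of a single table cell with history and per-user sandbox.
class Cell {
public:
    // write a value into the "global domain", visible to all users
    void Commit(int time, const std::string& user, const std::string& value) {
        assert(history.empty() || time >= history.back().time);
        Revision rev;
        rev.time = time;
        rev.user = user;
        rev.value = value;
        history.push_back(rev);
    }
    // change the value in the local sandbox of one user only
    void SetLocal(const std::string& user, const std::string& value) {
        sandbox[user] = value;
    }
    // move the sandbox value of a user into the global domain
    void CommitLocal(const std::string& user, int time) {
        std::map<std::string, std::string>::iterator it = sandbox.find(user);
        if (it != sandbox.end()) {
            Commit(time, user, it->second);
            sandbox.erase(it);
        }
    }
    // the value as seen by a user: sandbox value first, else latest global
    std::string ValueFor(const std::string& user) const {
        std::map<std::string, std::string>::const_iterator it = sandbox.find(user);
        if (it != sandbox.end()) return it->second;
        return history.empty() ? std::string() : history.back().value;
    }
    // the global value at a point in the past (empty if not yet set)
    std::string ValueAt(int time) const {
        std::string result;
        for (std::size_t i = 0; i < history.size() && history[i].time <= time; ++i) {
            result = history[i].value;
        }
        return result;
    }
    // revert by committing the old value again, the history stays intact
    void RevertTo(int pastTime, int now, const std::string& user) {
        Commit(now, user, ValueAt(pastTime));
    }
    std::size_t NumRevisions() const { return history.size(); }

private:
    std::vector<Revision> history;
    std::map<std::string, std::string> sandbox;
};
```

If “anna” changes a value with SetLocal(), “floh” still sees the old value until “anna” calls CommitLocal(). A revert is just another commit, so the change-log never loses information.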

The generic database table editor tool is very powerful, but the whole system really shines with specialized front-end editors like this:

This is basically just a different frontend to the global game database with a specialized GUI for efficiently equipping characters and chests in the game. The general idea here is that such a specialized editor tool is written for tasks where working with the generic editor is not desirable because it would be too cumbersome, un-intuitive or error-prone.

There’s also a lot of interesting work going on in our build-pipeline at the moment, but I’ll leave this to another post :)

13 Sept 2009

Workload

I’ve been playing around with getting N3 up and running in a pure web environment, running without local installation, pulling all data from a web-server instead of the local hard-disc, embedding rendering into a web-browser and so on. Works pretty well so far, but I’m shocked how arcane and downright silly writing a plug-in for Internet Explorer is. I can’t believe MS hasn’t released a simplified, specialized plug-in API for IE by now, but instead one is still required to dive right into the disgusting cesspit that is ActiveX / OLE. I pity the poor souls who had to make a living in the 90’s grinding on code for OLE or CORBA. Compared to .NET today, this really was software development hell.

On the other hand, NPAPI, the plug-in API for everything other than IE, also hails from the 90’s but it’s clear that it was designed by sane people, and to do one thing right instead of all things poorly: to let people write plug-ins for the Netscape browsers. Embedding N3 into Firefox was a matter of hours. But I already wasted 2 weekends even getting a clear idea how to do the same thing in IE, and every time I’ve finished another doc-reading-session I feel like I must wash my hands.

Gaming! If not for XBLA I would have considered giving up console gaming in the past months and returning to the PC camp. Game prices for the 360 and PS3 are simply hilarious right now in Germany. 70 has been the new 60 for quite a while now (and 60 Euros was already an outrage when “next-gen” started). Combined with the fact that it feels like half of the games are not even released in Germany it really doesn’t make a lot of sense to own a 360 (or PS3). The new Games On Demand service on Xbox Live just makes things worse. Of course you only get the crappy German versions of games (remember, there’s a hard IP address region check on Xbox Live), and whoever thought that 30 Euros for a 3 year old launch game is a good idea clearly lost touch with reality.

Thank god the UK exists, free haven and last stronghold of console gaming in Europe, where prices are reasonable and censorship doesn’t exist. England is probably saving the 360 in Germany right now. God Save The Queen :)

But not all is bad in the console world. XBLA has reinvented itself during the summer. Games like Battlefield 1943, Trials HD, and Shadow Complex catapulted the service to a new quality and sales numbers level. IMHO, XBLA was in dire danger of becoming a dump for cheap-ass retro-titles, but since Battlefield everything has changed. I’ve had more fun with Shadow Complex than with Gears 1 and 2 combined. I hope the success of SC means Gears 3 will become a bit more interesting (of course I don’t want a side-scrolling Gears, but I’d like to see more exploration and maybe some character up-leveling).

Dirt2 was a day-one for me, but at first I was terribly disappointed because I was expecting a somewhat arcady rally game like the original Dirt. Everything which I liked about Dirt was removed and more stuff was added which I clearly didn’t like. Especially the X-TREME bullshit: didn’t like the crap your co-pilot was spilling before the race (I’m Mister Smooth, you’re Mister Steady)? Well, now the game is completely full of this shit. I’m only halfway through Dirt2, so I’m not exactly sure, but they removed Pikes Peak (and hill-climbing altogether), as well as all the amazing “pseudo real world” tracks in Europe, and replaced them with complete fantasy tracks in more X-TREME locations like China and Malaysia. I’ve been playing a bit of the original Dirt again, and the game still looks pretty darn good compared to Dirt2. The car models in Dirt2 are clearly better, the environment in Dirt2 is more detailed, but for some reason doesn’t look as “realistic” as in Dirt1. The European tracks in Dirt simply nailed the look of a dark German forest during a rainy day. Dirt2 looks a lot more like a typical video game (Malaysia is a looker, though). Well, I still made my peace with Dirt2. It’s not a rally game, it’s a pure arcade racer now. It looks very good, it plays well, it’s fast-paced, and it has a really good multiplayer mode. But the name Colin McRae does not belong any longer on Dirt2’s box.

Now that the rally-genre is completely dead on the consoles I really wish Turn10 would step in and settle the issue once and for all. Give us Forza Rallysport already. Forza’s driving model is too good for just one game :)

I gave in to the hype and am currently about halfway through Arkham Asylum. My brain hasn’t been “indoctrinated” by American comics during my childhood. As a result I find that whole super-hero thing completely silly. I never enjoyed a single Marvel-licensed game (I think my record was like 20 minutes into one of the Spiderman games), and with few exceptions, all of the movies were utter crap (especially the Batman movies, haven’t seen the last one yet though). So… I was a bit skeptical about yet another super-hero game, to say the least. Well, what can I say? The game is fucking great! It’s a bit too old-school here and there (for instance, there’s A LOT of air duct crawling in the game). Sometimes I believe there’s only one guy in the gaming industry who’s doing all the air-ducts. He probably started his career in Half-Life, and then went on to Splinter Cell, Riddick, MGS and probably every other stealth-shooter ever made. So that’s a bit strange, crawling around in air ducts as The Batman. There are other strong design clichés at work in the game here and there. For some strange reason, some corridors in the Asylum look like they were stripped from a space-craft (you know, one of those typical video game space ship corridors, metallic surface, a bit rusty, hexagonal profile, leaky pipes along the walls). Maybe Mister Airducts brought his brother with him, who’s specialized in corridors.

But that’s just nitpicking, and a true hardcore gamer feels right at home with all the air-ducts and spacecraft corridors… The game itself plays really, really well, especially the slick hand-to-hand combat. There’s a lot to explore, sneaking, planning, nosing around in dark corners, just my thing. At times the game feels like Bioshock, and at other times a bit like Splinter Cell, but all in all this is the best implementation of the Batman universe I have ever seen, better than the movies anyway :)

16 Aug 2009

XNAMath

With all the focus on the console platforms I didn’t notice one very cool addition to the March DirectX SDK: XNAMath. This is basically the traditional Xbox360 vector math lib, ported to the PC with SSE2 and inlining support. The N3 math classes are now running from the same code base on top of XNAMath for the PC and Xbox360 platforms. Maik spent a few days analyzing the generated code, and after some tweaking the improvements for our simple math benchmarks are absolutely dramatic, up to 4x faster on the PC side!

We had to change our memory allocation routines on the PC to always return 16-byte aligned memory. Without this, XNAMath isn’t really useful, since the aligned load/store functions can’t be used on vectors residing in heap buffers. Really strange that there isn’t a way to do this through the Win32 heap functions directly (or is there?).
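One common way to get 16-byte aligned heap memory from an ordinary allocator is to over-allocate and round the pointer up. The following is only a minimal sketch of that idea, not the actual N3 allocator; the function names AllocAligned16(), FreeAligned16() and IsAligned16() are made up for illustration. On Windows, the MSVC runtime also offers _aligned_malloc() and _aligned_free(), which may be the simpler choice there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not Nebula3 API): returns 16-byte aligned memory by
// over-allocating and stashing the original malloc() pointer just in front
// of the aligned block.
inline void* AllocAligned16(std::size_t size)
{
    const std::size_t alignment = 16;
    // reserve room for the alignment slack plus the stashed pointer
    void* raw = std::malloc(size + alignment + sizeof(void*));
    if (raw == nullptr)
    {
        return nullptr;
    }
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    const std::uintptr_t aligned =
        (start + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);
    // remember what malloc() returned so FreeAligned16() can release it
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases memory obtained from AllocAligned16().
inline void FreeAligned16(void* ptr)
{
    if (ptr != nullptr)
    {
        std::free(reinterpret_cast<void**>(ptr)[-1]);
    }
}

// Returns true if the address is a multiple of 16.
inline bool IsAligned16(const void* ptr)
{
    return (reinterpret_cast<std::uintptr_t>(ptr) & 15) == 0;
}
```

The cost is up to 16 bytes plus one pointer of overhead per allocation, which is usually acceptable when the aligned load/store functions can then be used on heap data.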

Other than that I’m currently deep into “jobifying” the render thread, in order to free the PS3-PPU from the mundane number-crunching tasks. Properly jobified code will also “automatically” run about 2x faster on a 2-core PC, and about 3..4x faster on the Xbox360, since even single jobs will be split and processing will be distributed to worker threads. The actual speedup may even be higher, since the data must be re-organized into small independent chunks (“slices”) of about 16..32 kByte each in order to make the best use of the SPU local memory, and this improved spatial locality is also extremely beneficial for CPU caches on the other platforms (I think I’m starting to sound like a broken record, but I can’t stress enough how good this data-reorganization will be for N3 on ALL platforms :)
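To make the slicing idea a bit more concrete, here is a rough sketch of how a flat array could be cut into independent slices of a bounded byte size. The Slice struct and MakeSlices() are hypothetical names, not part of the N3 job system; a real job system would hand each slice to an SPU or a worker thread instead of just returning the list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one independent chunk of work.
struct Slice
{
    std::size_t firstElement;   // index of the first element in the slice
    std::size_t numElements;    // number of elements in the slice
};

// Splits numElements elements of elementSize bytes each into slices
// which are at most maxSliceBytes big (the last slice may be smaller).
inline std::vector<Slice> MakeSlices(std::size_t numElements,
                                     std::size_t elementSize,
                                     std::size_t maxSliceBytes)
{
    assert(elementSize > 0 && maxSliceBytes >= elementSize);
    const std::size_t elementsPerSlice = maxSliceBytes / elementSize;
    std::vector<Slice> slices;
    for (std::size_t first = 0; first < numElements; first += elementsPerSlice)
    {
        slices.push_back(Slice{ first, std::min(elementsPerSlice, numElements - first) });
    }
    return slices;
}
```

With 4-byte elements and a 16 kByte limit, 10000 elements end up in three slices, the last one only partially filled.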

28 Jul 2009

Brain Rot

I think working on the PS3 is slowly destroying my higher brain functions. All I can think of at the moment are cycle counts, cache misses, memory latency, compiler intrinsics and synchronization issues. It’s becoming harder and harder to communicate with humans, let alone formulate a proper blog post. On one hand I welcome this fallback into nerd-dom, reminds me of the time when I learned programming by hacking hex-code into the 256 bytes of RAM of the LC-80. On the other hand it’s incredibly frustrating because everything takes so fucking long. Working for 3 days on a small problem which doesn’t even exist on other platforms isn’t fun. But ultimately the PS3 port will be incredibly good for Nebula3 because it forces me to think very hard about the data layout and data flows in the engine, and the PS3-port is now laying the foundation to scale beyond 2..3 cores on all future platforms.

So, quick status update before I turn into a zombie:

We’ll support FMOD in the future, integration into Nebula3 is currently underway. Only sensible way to do sound in multiplatform projects IMHO.
I have implemented Wolfgang Engel’s light pre-pass renderer in the PS3 port. I think we’ll use this also for the other platforms (well, except the Wii of course). Very cool stuff and solves a lot of problems on the fragment-shader level (jeez, I meant pixel-shader). I’ll write a bit more about this later.
I’m currently implementing a job system for Nebula3, designed around PS3 SPURS jobs. On other platforms, jobs will run in a thread-pool on the CPU. It will not be completely multiplatform, but it will be easy to create and maintain jobs for multiplatform projects (the only thing not multiplatform will be the actual job function, but this is usually just a few (maybe a few dozen) lines of code). Once I have profiling results I’ll write a blog post about the design and usage of the job system (it’s a bit different than what I described here).
The multiplayer component of N3 is currently being rewritten and ported to Xbox360 (and later PS3). The PC version (and probably PS3 version too) is based on RakNet, on the 360 we’re using an API from the XDK.
I have completely rewritten the whole StringAtom system for performance and memory footprint and have dropped the generalized Atom<> and Proxy<> classes (which the old StringAtom system was built on). The new system is hardwired for strings and nothing else.
Call Of Juarez BiB totally caught me by surprise. Best game so far this year on the 360 IMHO, and much more streamlined than the original.

That’s all for now. It will be very good to eventually crawl out of the cave into the sunlight and develop back into a normal human being again :)

27 Jun 2009

A more streamlined shader system

I’m currently porting N3’s shader system to the PS3, and while I got it basically working, I’m not satisfied because it’s not really possible to come up with a solution that’s really optimized for the PS3 without creating my own version of the DirectX FX system. Currently N3 depends too much on obscure FX features which actually aren’t really necessary for a realtime 3D engine. So this is a good time to think about trimming off a lot of fat from the shader classes in CoreGraphics.

I think the higher level state management parts are in pretty good shape. The FrameShader system divides a render-frame into render-passes, posteffect-passes and render-batches, draw-calls are grouped by shader-instance. All together these do a good job of preventing redundant state switches. So above the CoreGraphics subsystem (almost) everything is fine.

Here’s how shaders are currently used in N3:

FramePass: a pass-shader sets render states which are valid throughout the whole pass (for instance, a depth-pass may disable color-writes, and so on…), but doesn’t contain vertex- or pixel-shaders
FrameBatch: this is one level below passes, a batch-shader sets render state which is valid for rendering a batch of ModelNodeInstances with the same ModelNodeType (for instance, a render batch which contains all SrcAlpha/InvSrcAlpha-blended objects would configure the alpha blending render states accordingly). FrameBatches usually don’t have a vertex/pixel-shader assigned, but there are cases where this is useful, if everything that is rendered during the batch uses the same shader anyway.
PostEffects: a post-effect-pass uses a shader to render a fullscreen-quad
StateNodes and StateNodeInstances: node-instances are always rendered in batches, so that material parameters only need to be set once at the start of a node-batch, other parameters (most notably the ModelViewProjection matrix) must be set before each draw-call.
Specific subsystems may use shaders for their own rendering as they please, but let’s just say that those will have to live with the limitations of the new shader system ;)

This defines a pretty small set of requirements for a refactored Nebula3 shader system. By removing feature requirements and making the system *less* flexible, we have a better foundation for low-level optimizations and especially for porting the code to other platforms without compromises.

Here’s a quick overview of the current shader system:

The important classes and concepts to understand N3’s shader system are Shader, ShaderInstance, ShaderVariation and ShaderVariable. The basic idea of the shader system is to move all the slow stuff into the setup phase (for instance looking up shader variables) instead of doing it every frame.

A Shader encapsulates an actual shader file loaded from disc. Shaders are created once at application start, one Shader object per shader file in the shd: directory. Shaders are not actually used for anything except providing a template for ShaderInstances.

A ShaderInstance is created from a specific shader and contains its own set of shader parameter values. For instance, every Models::StateNode object creates its own ShaderInstance and initializes it with its specific material parameters.

A ShaderVariable is created from a ShaderInstance and is used to set the value of a specific shader parameter of the ShaderInstance.

ShaderVariations are the most interesting part of the system. Under the hood, a ShaderVariation is an FX Technique, and thus can have a completely different set of render states and vertex/pixel shader code. The variation that is currently used for rendering is selected by a “feature bit mask”, and the bits of the feature mask can be set and deleted in different parts of the rendering pipeline. For instance, the Depth Pass may set the “Depth” feature bit, and a character rendering node may set the “Skinned” feature bit, resulting in the feature mask “Depth|Skinned”. A Shader can contain any number of variations, each with a specific feature mask. Thus a shader which supports rendering in the depth pass, and also supports skinning would implement (among others) a variation “Depth|Skinned” for rendering skinned geometry during the depth pass.
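The feature-mask lookup described above can be sketched in a few lines. This is a simplified stand-in, not the CoreGraphics implementation: the FeatureBits enum, the ShaderSketch class and the technique names are invented for illustration, and the sketch assumes that a variation is found by an exact match on the mask.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical feature bits; the names "Depth" and "Skinned" come from the text.
enum FeatureBits : std::uint32_t
{
    Depth   = 1u << 0,
    Skinned = 1u << 1,
    Alpha   = 1u << 2,
};

// Simplified stand-in for a shader with several variations,
// each keyed by the exact feature mask it implements.
class ShaderSketch
{
public:
    // Registers a variation (here just a technique name) for a feature mask.
    void AddVariation(std::uint32_t featureMask, const std::string& techniqueName)
    {
        this->variations[featureMask] = techniqueName;
    }

    // Returns the technique name for the mask, or an empty string
    // if the shader does not implement such a variation.
    std::string SelectVariation(std::uint32_t featureMask) const
    {
        auto it = this->variations.find(featureMask);
        if (it == this->variations.end())
        {
            return std::string();
        }
        return it->second;
    }

private:
    std::map<std::uint32_t, std::string> variations;
};
```

Different parts of the pipeline would set or clear their bit in a shared mask (mask |= Depth in the depth pass, mask |= Skinned in a character node), and the lookup with the combined mask happens right before rendering.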

The biggest problem with the current system is that ShaderVariables are so damn powerful, thanks to the FX system. FX implicitly caches the current value of shader variables, so that a variable can be set at any time before or after the shader is actually used for rendering. FX also tracks changes to shader variables and knows which shader variables need to be updated when it comes to rendering. This extra caching layer is actually not really required by Nebula3 and removing it allows for a much more streamlined implementation if FX is not available or not desirable.

Here are the changes I currently have in mind:

A lot of reflection information will not be available. A Shader object won’t be required to enumerate the shader parameters and query their type. The only information that will probably survive is whether a specific parameter exists, but not its datatype, array size, etc… This is not really a problem for a game application, since the reflection information is usually only needed in DCC tools.
Querying the current value of a shader parameter won’t be possible (in fact that’s already not possible in N3, even though FX offers this feature).
ShaderInstances will offer a way to set up shader parameters as “immutable”. Immutable parameters are set once but cannot be changed afterwards. For instance, the material parameters of a StateNode won’t be changed once they are initialized, and setting them up as immutable allows the shader to record them into some sort of opaque command buffer or state block and then completely forget about them (so it doesn’t have to keep track of immutable parameters at all once they have been set up).
Mutable shader parameters (like the ModelViewProjection matrix) are not required to cache their current value, this means that N3 may only set mutable shader parameters after the ShaderInstance has been set as the active shader. This may be a bit inconvenient here and there, but it relieves the low-level shader system from a lot of house-keeping.
With the 2 previous points a ShaderInstance doesn’t have to keep track of shader parameters at all. All the immutables are stored away in command buffers, and all the mutables could directly go through to the rendering API.
It won’t be possible to change rasterization state through shader variables at all (like CullMode or AlphaRef - basically, everything that’s set inside a FX Technique), shader variables may only map to vertex- or pixel-shader uniform constants.
N3 still needs to call a Commit() method on the shader instance after setting mutable parameters and before issuing the draw-call since depending on the platform, some housekeeping and batching may still be necessary for setting mutable parameters efficiently.
Shared shader parameters will (probably) go away. They are convenient but hard to replicate on other platforms. Maybe a similar mechanism will be implemented through a new N3 class so that platform ports have more freedom in implementing shared shader parameters (shared parameters will likely have to be registered by N3 instead of being automatically setup by the platform’s rendering API).
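As a rough illustration of the immutable/mutable split from the list above, consider the following sketch. It is purely hypothetical: ShaderInstanceSketch is not an N3 class, parameters are reduced to named floats, and a std::map stands in for the rendering API state.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of the planned split between immutable and
// mutable shader parameters; none of these names are Nebula3 classes.
class ShaderInstanceSketch
{
public:
    // Immutable parameters are recorded once during setup.
    void SetImmutable(const std::string& name, float value)
    {
        assert(!this->sealed && "immutable parameters must be set during setup");
        this->recorded.push_back(std::make_pair(name, value));
    }

    // After sealing, no more immutable parameters are accepted.
    void Seal()
    {
        this->sealed = true;
    }

    // Makes the instance active and replays the recorded parameters.
    void Apply(std::map<std::string, float>& apiState)
    {
        for (const auto& entry : this->recorded)
        {
            apiState[entry.first] = entry.second;
        }
        this->active = true;
    }

    // Mutable parameters are not cached, so they go straight through
    // to the API and may only be set while the instance is active.
    void SetMutable(std::map<std::string, float>& apiState,
                    const std::string& name, float value)
    {
        assert(this->active && "set mutable parameters only after Apply()");
        apiState[name] = value;
    }

private:
    std::vector<std::pair<std::string, float>> recorded;
    bool sealed = false;
    bool active = false;
};
```

The point of the design is visible in the private section: the instance stores a replay list and two flags, but no per-parameter "current value" cache.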
The last fix will be in the higher-level code where ModelNodeInstances are rendered:

Currently, draw-calls are grouped by their ShaderInstance, but those instance-groups themselves are not sorted in a particular order. In the future, rendering will also be sorted by Shader and active ShaderVariation, which makes it possible to change the vertex and pixel shaders less frequently (changing the pixel shader is particularly bad).
Currently, a ShaderInstance object contains a complete cloned FX effect instance, in the future, a shader instance will not use a cloned effect - instead, immutable parameters will be captured into a parameter block and mutable parameter changes will be routed directly to the “parent” FX effect.
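The planned sorting could look roughly like this. It is a sketch under assumptions: the DrawCall struct with plain integer ids is invented for illustration, and the real code would sort groups of ModelNodeInstances rather than individual records.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical draw-call record; the ids stand in for Shader,
// ShaderVariation and ShaderInstance objects.
struct DrawCall
{
    int shaderId;
    int variationId;
    int instanceId;
};

// Orders draw calls by shader, then variation, then instance, so that
// vertex/pixel shader changes happen as rarely as possible.
inline void SortDrawCalls(std::vector<DrawCall>& calls)
{
    std::sort(calls.begin(), calls.end(), [](const DrawCall& a, const DrawCall& b)
    {
        return std::tie(a.shaderId, a.variationId, a.instanceId)
             < std::tie(b.shaderId, b.variationId, b.instanceId);
    });
}

// Counts how often the (shader, variation) pair changes between
// consecutive draw calls.
inline int CountShaderSwitches(const std::vector<DrawCall>& calls)
{
    int switches = 0;
    for (std::size_t i = 1; i < calls.size(); i++)
    {
        if (calls[i].shaderId != calls[i - 1].shaderId ||
            calls[i].variationId != calls[i - 1].variationId)
        {
            switches++;
        }
    }
    return switches;
}
```

Counting the switches before and after sorting is a cheap way to check whether the sort order actually pays off for a given scene.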

Another mid-term-goal I want to aim for is to reduce the number of per-draw-call shader-parameter-changes as much as possible. There are a lot of lighting parameters fed into the pixel shader for each draw call currently. Whether this means going to some sort of deferred rendering approach or maybe setting up per-model lighting parameters in a dynamic texture, or trying to move more work back into the vertex shader I don’t know yet.

20 Jun 2009

N3 I/O Tips & Tricks

Note: all of the following code-fragments assume:

using namespace Util;
using namespace IO;

Working with Assigns

Assigns are path aliases which are used instead of hardwired file locations. This lets an N3 application use filenames which are independent from the host platform, actual Windows version, or Windows language version. The following “system assigns” are pre-defined in a standard Nebula3 application:

home: Points to the app’s installation directory, or (on console platforms) the root directory of the game content. The home: location should always be treated as read-only!
bin: Points to the location where the app’s executable resides, should be treated as read-only (not available on console platforms)
user: On Windows, points to the logged-in user’s data directory (e.g. on Windows 7 this is c:\Users\[user]\Documents). Can be treated as read/write, and this is where profile-data and save-game-files should be saved. On consoles this assign may point to the save-location, or it may not be available at all (if saving game data is not handled through some sort of file system on that specific platform).
temp: On Windows, this points to the logged-in user’s temp directory (e.g. on Windows 7 this is c:\Users\[user]\AppData\Local\Temp). This directory is read/write but applications should assume that files in this directory may disappear at any time. This assign is not available on console platforms.
programs: On Windows, this points to the standard location for programs (e.g. “c:\Program Files”)
appdata: On Windows, this points to the user’s AppData directory (e.g. c:\Users\[user]\AppData)

In addition to these system assigns, Nebula3 sets up the following “content assigns” at startup which all applications can rely on:

export: points to the root of the directory where all game data files reside
- ani: root directory of animation files
- data: root directory of general data files
- video: root directory of movie files
- seq: root directory of sequence files (e.g. engine-cutscenes)
- stream: root directory of streaming audio files
- tex: root directory of texture files
- frame: root directory of frame shader files
- mdl: root directory of .n3 files
- shd: root directory of shader files
- audio: root directory for non-streaming audio files
- sui: root directory for “Simple GUI” scene files

More standard assigns may be added in the future.

An application can define its own assigns or override existing assigns using the IO::AssignRegistry singleton:

AssignRegistry::Instance()->SetAssign(Assign("bla", "home:blub"));

To use the new assign, simply put it at the beginning of a typical file path:

“bla:readme.txt”

This would resolve to the following absolute filename (assuming your app is called “MyApp” and located under the standard location for programs under Windows):

“C:\Program Files\MyApp\blub\readme.txt”

You can resolve a path name with assigns into an absolute file name through the AssignRegistry, this is usually necessary when working with 3rd party libs:

String absPath = AssignRegistry::Instance()->ResolveAssignsInString("bla:readme.txt");

Finally, Assigns are not restricted to file system locations:

AssignRegistry::Instance()->SetAssign(Assign("bla", "http://www.radonlabs.de/blub"));
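For readers who want to see how such alias resolution can work in principle, here is a small standalone sketch. It is not the IO::AssignRegistry implementation: AssignTableSketch is a made-up class, and the details (repeated resolution of nested assigns, inserting a "/" between the target and the rest of the path, treating a single letter before the colon as a drive letter) are assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified assign table (not the IO::AssignRegistry class).
class AssignTableSketch
{
public:
    // Defines or overrides an assign.
    void SetAssign(const std::string& name, const std::string& path)
    {
        this->assigns[name] = path;
    }

    // Repeatedly replaces a leading "name:" with its target until no
    // known assign is left at the start of the path.
    std::string Resolve(std::string path) const
    {
        // the depth limit guards against assigns that refer to each other
        for (int depth = 0; depth < 16; depth++)
        {
            const std::string::size_type colon = path.find(':');
            // a single character before the colon is treated as a drive letter
            if (colon == std::string::npos || colon < 2)
            {
                break;
            }
            auto it = this->assigns.find(path.substr(0, colon));
            if (it == this->assigns.end())
            {
                break;
            }
            std::string target = it->second;
            if (!target.empty() && target.back() != '/' && target.back() != ':')
            {
                target += '/';
            }
            path = target + path.substr(colon + 1);
        }
        return path;
    }

private:
    std::map<std::string, std::string> assigns;
};
```

Since "http" is not registered as an assign in this sketch, a path that resolves to an "http://…" location is simply left alone after the first replacement.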

Listing directory content

You can list the files or subdirectories of a directory with pattern matching like this:

// list all files in the app's export directory
Array<String> files = IoServer::Instance()->ListFiles("home:export", "*");

// list all subdirectories in the app's export directory
Array<String> dirs = IoServer::Instance()->ListDirectories("home:export");

// list all DDS textures of category "leafs"
Array<String> leafTextures = IoServer::Instance()->ListFiles("tex:leafs", "*.dds");

Note that the returned strings are not full pathnames, only the actual file- and directory-names!

Working with directories

You can create directories and subdirectories with a single method call:

IoServer::Instance()->CreateDirectory("home:bla/blub/blob");

This will also create all missing subdirectories as needed.

You can delete the tail directory of a path, but the directory must be empty:

IoServer::Instance()->DeleteDirectory("home:bla/blub/blob");

This will delete the “blob” subdirectory only, and only if there are no files or subdirectories left under “blob”.

You can check whether a directory exists:

if (IoServer::Instance()->DirectoryExists("home:bla/blub"))
{
    // directory exists
}

NOTE:

creating and deleting directories in archive files doesn’t work
all directory functions only work in the file system, so a DirectoryExists(“http://www.radonlabs.de/bla”) will *not* work

Working with files

The following IoServer methods are available for files, these all work directly with path names, so you don’t need to have an actual Stream object around:

// check whether a file exists:
if (IoServer::Instance()->FileExists("home:readme.txt")) …

// delete a file:
IoServer::Instance()->DeleteFile("home:readme.txt");

// copy a file:
IoServer::Instance()->CopyFile("home:src.txt", "home:dst.txt");

// check if the read-only flag is set on a file:
if (IoServer::Instance()->IsReadOnly("home:src.txt")) …

// set the read-only flag on a file:
IoServer::Instance()->SetReadOnly("home:src.txt");

// getting the last modification time of a file:
FileTime fileTime = IoServer::Instance()->GetFileWriteTime("home:readme.txt");

// setting the last modification time of a file:
IoServer::Instance()->SetFileWriteTime("home:readme.txt", fileTime);

NOTE:

DeleteFile(), SetReadOnly(), SetFileWriteTime() do not work in file archives
CopyFile() does not work if the destination is located in a file archive
all of these functions only work with file system paths (not “http://…”)

How to get the size of a file

Currently, the only way to query the size of a file is through an open IO::Stream object. This may change in the future though.

Ptr<Stream> stream = IoServer::Instance()->CreateStream("home:readme.txt");
if (stream->Open())
{
    Stream::Size fileSize = stream->GetSize();
    stream->Close();
}

Working with file archives

You can mount zip archives as an overlay over the actual file system:

IoServer::Instance()->MountArchive("home:archive.zip");

All file system accesses will first check mounted archives (in mount order) before falling back to the actual file system.
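The lookup order can be pictured with a tiny sketch. Nothing here is N3 code: ArchiveSketch and FindProvider() are hypothetical, and each archive is reduced to a name and a set of contained paths.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a mounted archive: a name plus its
// table of contents.
struct ArchiveSketch
{
    std::string name;
    std::set<std::string> toc;
};

// Returns the name of the first mounted archive (in mount order) that
// contains 'path', or "filesystem" if no archive provides the file.
inline std::string FindProvider(const std::vector<ArchiveSketch>& mounted,
                                const std::string& path)
{
    for (const ArchiveSketch& archive : mounted)
    {
        if (archive.toc.count(path) > 0)
        {
            return archive.name;
        }
    }
    return "filesystem";
}
```

If two archives contain the same path, the one mounted first wins in this sketch.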

Archives have the following restrictions:

writing to archives is not supported
the archive filesystem will keep some sort of table-of-content in memory as long as an archive is mounted, the actually required size of the TOC differs by platform.
in the current zlib-based implementation, complete files are decompressed into memory, thus opening a 100 MB file from an archive will also allocate 100 MB of memory until the file is closed
for the above reason, streaming from archive files doesn’t make sense, thus things like streaming audio wave banks or movie files should not be placed into archive files

On console platforms, platform-native file archive formats are used if available (e.g. .PSARC on PS3 or .ARC on Wii), which usually have fewer restrictions and are better optimized for the platform than plain ZIP support. The Xbox360 port currently uses the standard zlib implementation but this will very likely change in the future.

On Windows (and currently Xbox360), zip support is handled through zlib.

You can add support for other archive formats by deriving subclasses from the classes under foundation/io/archfs, but currently it is not possible to mix different archive formats in one application (because you need to decide on a specific archive-filesystem-implementation at compile-time).

You can turn off the archive file-system layer completely through IoServer::SetArchiveFileSystemEnabled(false). All file accesses will then go directly into the actual file system. This is useful for tools which need to make sure that they don’t accidentally read data from an archive file.

Nebula3 defines a “standard archive” where all game data is located. The data in the archive is located under the “export:” assign on all platforms. The actual archive filenames for the various platforms are:

Win32: home:export_win32.zip
Xbox360: home:export_xbox360.zip
Wii: home:export_wii.arc
PS3: home:export_ps3.psarc

The archiver3.exe tool takes care of generating those standard archives as part of the build process. When generating data for console platforms, the actual console SDK must be installed (however, please note that we cannot currently license the N3 console ports to other companies anyway).

Working with the SchemeRegistry

The IO::SchemeRegistry singleton associates URI schemes (those things at the start of a URI, e.g. “file://…”, “http://…”) with Stream classes. You can override the pre-defined scheme associations or register your own scheme with a stream class of your own:

// register my own scheme and stream class:
SchemeRegistry::Instance()->RegisterUriScheme("myscheme", MyStream::RTTI);

// create a stream object by URI:
Ptr<Stream> stream = IoServer::Instance()->CreateStream("myscheme://bla/blub");

// the returned stream object should be an instance of our derived class:
n_assert(stream->IsInstanceOf(MyStream::RTTI));

You can also override standard associations to route all file accesses through your own stream class like this:

// override the file scheme to use our own stream class:
SchemeRegistry::Instance()->RegisterUriScheme("file", MyStream::RTTI);
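The idea behind the scheme registry (a table from scheme name to "something that can create a stream") can be sketched without any N3 code. In the following standalone example all class names are hypothetical, and factory functions are used where N3 uses Rtti objects.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal stream hierarchy (not the IO::Stream classes).
struct StreamSketch
{
    virtual ~StreamSketch() = default;
    virtual std::string ClassName() const = 0;
};

struct FileStreamSketch : StreamSketch
{
    std::string ClassName() const override { return "FileStreamSketch"; }
};

struct MyStreamSketch : StreamSketch
{
    std::string ClassName() const override { return "MyStreamSketch"; }
};

// Maps URI schemes to factory functions; registering a scheme again
// overrides the earlier association.
class SchemeTableSketch
{
public:
    using Factory = std::function<std::unique_ptr<StreamSketch>()>;

    void RegisterUriScheme(const std::string& scheme, Factory factory)
    {
        this->factories[scheme] = std::move(factory);
    }

    // Returns a null pointer if the URI has no scheme or the scheme
    // has not been registered.
    std::unique_ptr<StreamSketch> CreateStream(const std::string& uri) const
    {
        const std::string::size_type pos = uri.find("://");
        if (pos == std::string::npos)
        {
            return nullptr;
        }
        auto it = this->factories.find(uri.substr(0, pos));
        if (it == this->factories.end())
        {
            return nullptr;
        }
        return it->second();
    }

private:
    std::map<std::string, Factory> factories;
};
```

Overriding the "file" scheme then simply means registering a second factory under the same key, after which every "file://…" URI yields the new stream class.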

Reading and writing XML files

Attach an IO::XmlReader object to an IO::Stream object to parse the content of an XML file. The XmlReader can access nodes through path names, so you can navigate XML nodes like files in a file system. The XmlReader tracks a “current node” internally, like the “current directory” in a file system API.

// create an XML reader and parse the file "home:test.xml":
Ptr<Stream> stream = IoServer::Instance()->CreateStream("home:test.xml");
Ptr<XmlReader> xmlReader = XmlReader::Create();
xmlReader->SetStream(stream);
if (xmlReader->Open())
{
    // test if a specific node exists in the XML file:
    if (xmlReader->HasNode("/Nebula3/Models"))…

    // position the current node on "/Nebula3/Models":
    xmlReader->SetToNode("/Nebula3/Models");

    // iterate over child nodes of current node:
    if (xmlReader->SetToFirstChild()) do
    {
        …
    } while (xmlReader->SetToNextChild());

    // iterate over child nodes named "Model":
    if (xmlReader->SetToFirstChild("Model")) do
    {
        …
    } while (xmlReader->SetToNextChild("Model"));

    // test if the current node has a "name" attribute
    // and read its value as a string:
    if (xmlReader->HasAttr("name"))
    {
        String name = xmlReader->GetString("name");
    }

    // if the "name" is optional, you can also do this in one line of
    // code and provide a default value, if "name" is not present:
    String name = xmlReader->GetOptString("name", "DefaultName");

    // you can also read simple data types directly:
    int intVal = xmlReader->GetInt("intAttr");
    float floatVal = xmlReader->GetFloat("floatAttr");
    …

    // to read the current node's content (<Node>Content</Node>):
    if (xmlReader->HasContent())
    {
        String content = xmlReader->GetContent();
    }

    // close the reader when done
    xmlReader->Close();
}

To create a new XML file, use an XmlWriter in a similar fashion:

Ptr<Stream> stream = IoServer::Instance()->CreateStream("temp:bla.xml");
Ptr<XmlWriter> xmlWriter = XmlWriter::Create();
xmlWriter->SetStream(stream);
if (xmlWriter->Open())
{
    // write a node hierarchy, and add a few attributes to the leaf node:
    xmlWriter->BeginNode("Nebula3");
      xmlWriter->BeginNode("Models");
        xmlWriter->BeginNode("Model");
          xmlWriter->SetString("name", "A Model");
          xmlWriter->SetInt("intVal", 20);
          …
        xmlWriter->EndNode();
      xmlWriter->EndNode();
    xmlWriter->EndNode();

    // close xml writer, this will also close the stream object
    xmlWriter->Close();
}

// you can also write a comment to the XML file:
xmlWriter->WriteComment("A Comment");

// or write the content enclosed by the current node:
xmlWriter->WriteContent("Content");

Additional notes on XML processing:

XmlReader and XmlWriter use TinyXml internally which has a tiny modification to read and write data through Nebula3 stream objects instead of the host filesystem

Reading large XML files can be very slow because of the thousands of small allocations going on for string data. Reading XML files is therefore not recommended for actual game applications; use optimized binary formats or Nebula3’s database subsystem instead.

There’s currently no easy way to read an XML file, modify it and write it back.

Working with BinaryReader / BinaryWriter

The IO::BinaryReader and IO::BinaryWriter classes implement access to streams as a sequence of simple typed data elements like int, float, float4 or strings with automatic byte order conversion for different platforms:

// read data from a file using BinaryReader:
Ptr<Stream> stream = IoServer::Instance()->CreateStream("home:bla.bin");
Ptr<BinaryReader> reader = BinaryReader::Create();
reader->SetStream(stream);
if (reader->Open())
{
    uchar ucharVal = reader->ReadUChar();
    float floatVal = reader->ReadFloat();
    String strVal = reader->ReadString();
    Math::matrix44 matrixVal = reader->ReadMatrix44();
    Blob blob = reader->ReadBlob();
    …
    reader->Close();
}

// writing data is just the other way around:
Ptr<Stream> stream = IoServer::Instance()->CreateStream("temp:bla.bin");
Ptr<BinaryWriter> writer = BinaryWriter::Create();
writer->SetStream(stream);
if (writer->Open())
{
    writer->WriteUShort(123);
    writer->WriteString("Bla");
    …
    writer->Close();
}

The byte order of BinaryReader/Writer is by default set to ByteOrder::Host (the host platform’s native byte order). You can enable automatic byte order conversion with the SetStreamByteOrder() method on BinaryReader and BinaryWriter. For instance, to create a binary file for one of the PowerPC-driven consoles from a tool running on the PC you would set up the BinaryWriter like this:

Ptr<Stream> stream = IoServer::Instance()->CreateStream("temp:bla.bin");
Ptr<BinaryWriter> writer = BinaryWriter::Create();
writer->SetStream(stream);
writer->SetStreamByteOrder(System::ByteOrder::BigEndian);
if (writer->Open()) …

Automatic byte order conversion naturally doesn’t work for Util::Blob objects, since the reader/writer doesn’t know how the data inside the blob is structured.
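What "byte order conversion" means for a single typed element can be shown with a small host-independent sketch. The two helper functions are hypothetical and not part of BinaryReader/BinaryWriter; they only demonstrate why the writer needs to know the element type, which is exactly the information it lacks for a Blob.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: appends a 32-bit value to a byte buffer in
// big-endian order, independent of the host's native byte order.
inline void WriteUIntBigEndian(std::vector<std::uint8_t>& out, std::uint32_t value)
{
    out.push_back(static_cast<std::uint8_t>((value >> 24) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((value >> 16) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((value >> 8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>(value & 0xFF));
}

// Hypothetical helper: reads a big-endian 32-bit value starting at 'offset'.
inline std::uint32_t ReadUIntBigEndian(const std::vector<std::uint8_t>& in, std::size_t offset)
{
    assert(offset + 4 <= in.size());
    return (static_cast<std::uint32_t>(in[offset]) << 24) |
           (static_cast<std::uint32_t>(in[offset + 1]) << 16) |
           (static_cast<std::uint32_t>(in[offset + 2]) << 8) |
            static_cast<std::uint32_t>(in[offset + 3]);
}
```

Because the code shifts values instead of reinterpreting memory, it produces the same bytes on little-endian and big-endian hosts.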

Additional Notes:

Only use BinaryReader / BinaryWriter in an actual game project when absolutely necessary. For the sake of efficient and fast loading from disk it’s usually better to prepare any sort of game data as a native memory dump which can be loaded with a simple Stream::Read() and immediately used without any parsing or data conversions going on during load. Time spent in the build pipeline is a thousand times cheaper than time spent waiting for a level to load, thus BinaryReader and BinaryWriter are much better used in offline tools!

Even in offline tools, BinaryReader/Writer can be very slow when processing thousands of data elements, since reading or writing each little data element will cause a complete round-trip through the ReadFile / WriteFile functions of the host’s operating system. Use the SetMemoryMappingEnabled(true) method to speed up reading and writing of data elements drastically by caching the data in memory. In the BinaryReader, this will load the entire file into memory in Open(), and in the BinaryWriter, all writes will go into a memory buffer first, which will then be dumped to a file in Close() with a single Write().

Reading Excel XML files with Nebula3

You can use the IO::ExcelXmlReader stream reader class to parse files saved in XML format from MS Excel (all versions should work, but when in doubt, save as Excel 2003 XML file). After opening an Excel file, the content of the file can be accessed by table, column and row index:

Ptr<Stream> stream = IoServer::Instance()->CreateStream("home:excelsheet.xml");
Ptr<ExcelXmlReader> reader = ExcelXmlReader::Create();
reader->SetStream(stream);
if (reader->Open())
{
    // NOTE: when working with the left-most "default" table,
    // we can simply omit the table index in all methods.
    //
    // Test if a column exists in the left-most table:
    if (reader->HasColumn("Bla"))
    {
        // get the column index (returns InvalidIndex if it does not exist)
        IndexT colIndex = reader->FindColumnIndex("Bla");

        // iterate over all rows and read content of column "Bla":
        IndexT rowIndex;
        SizeT numRows = reader->GetNumRows();
        for (rowIndex = 0; rowIndex < numRows; rowIndex++)
        {
            // get the content of cell at (rowIndex,colIndex):
            String elm = reader->GetElement(rowIndex, colIndex);
        }
    }
}

You can also access specific tables in an Excel file. In this case, an additional tableIndex parameter is used in most methods:

// get the number of tables in the Excel file:
SizeT numTables = reader->GetNumTables();

// find a table index by name (returns InvalidIndex if it does not exist)
IndexT tableIndex = reader->GetTableIndex("TableName");
// get the name of table at index
const String& tableName = reader->GetTableName(tableIndex);
// to access a particular table, simply add the table index as last
// parameter to all other methods:
SizeT numRowsOfTable = reader->GetNumRows(tableIndex);
SizeT numColsOfTable = reader->GetNumColumns(tableIndex);
bool hasColumn = reader->HasColumn("Bla", tableIndex);
String elm = reader->GetElement(rowIndex, colIndex, tableIndex);
…

Note that Excel XML files come with a lot more fluff than actual data, so reading an Excel table bigger than a few dozen or hundred kBytes is EXTREMELY slow. Do not use Excel XML files directly in your game application, but only as source data files in the content pipeline, and convert them to binary files or write them into an SQLite database during the build process for more efficient consumption by the game application.

How to register your own ConsoleHandler

Console handlers are attached to the IO::Console singleton to handle console output and optionally provide text input from a console. When Nebula3 starts up, a standard console handler is created in IO::Console::Open() which (under Windows) will route output to STD_OUTPUT_HANDLE and STD_ERROR_HANDLE, and provide non-blocking input from STD_INPUT_HANDLE. When not compiled with the PUBLIC_BUILD define, all text output will also go to OutputDebugString() and thus show up in the debugger, which is especially useful when running a windowed application (as opposed to a console application) under Windows.

On console platforms, the default ConsoleHandlers route all text output to the debug out channel.

Nebula3 provides 2 optional console handlers:

IO::LogFileConsoleHandler: can be used to capture all text output into a log file

IO::HistoryConsoleHandler: captures console output into an in-memory ring buffer. This is currently used by the IO::ConsolePageHandler for the Debug-HTTP-Server in order to provide a snapshot of console output in a web browser connected to the N3 application.

Registering your own console handler is easy: just derive a subclass from IO::ConsoleHandler, override a few virtual methods, and call IO::Console::AttachHandler() with a pointer to an instance of your derived class early in your application.
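
Here's a rough, self-contained sketch of that pattern. Note that it does not use the real Nebula3 headers: the ConsoleHandler base class with its Print()/Error() methods, the Console class and the CollectingConsoleHandler are simplified stand-ins which I made up for illustration, so check the actual IO::ConsoleHandler header for the virtual methods you really need to override.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for IO::ConsoleHandler (hypothetical, simplified interface).
class ConsoleHandler
{
public:
    virtual ~ConsoleHandler() = default;
    virtual void Print(const std::string& text) = 0;
    virtual void Error(const std::string& text) = 0;
};

// Stand-in for the IO::Console singleton: forwards output to all attached handlers.
class Console
{
public:
    static Console& Instance()
    {
        static Console console;
        return console;
    }
    void AttachHandler(std::shared_ptr<ConsoleHandler> handler)
    {
        assert(handler != nullptr);
        handlers.push_back(std::move(handler));
    }
    void Print(const std::string& text)
    {
        for (auto& h : handlers) h->Print(text);
    }
    void Error(const std::string& text)
    {
        for (auto& h : handlers) h->Error(text);
    }
private:
    std::vector<std::shared_ptr<ConsoleHandler>> handlers;
};

// A custom handler which simply collects all output lines in memory.
class CollectingConsoleHandler : public ConsoleHandler
{
public:
    void Print(const std::string& text) override { lines.push_back(text); }
    void Error(const std::string& text) override { lines.push_back("[ERROR] " + text); }
    std::vector<std::string> lines;
};
```

In this sketch, attaching an instance of CollectingConsoleHandler early at startup means that every later Print() or Error() call on the console also lands in its lines array.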

Phew, I think that’s all the stuff that’s good-to-know when doing file IO in Nebula3. Please note that this was all about synchronous I/O. For asynchronous IO the IO::IoInterface interface-singleton is used. It simply launches a thread with its own thread-local IoServer, and IO operations are sent to the IO thread using Message objects (which in turn need to be checked for completion of the IO operation). It’s not much more complicated than synchronous IO. However, I’m planning to do a few changes under the hood for asynchronous IO to enable better platform-specific optimizations in the future.

16 Jun 2009

Nebula3 RTTI Tips & Tricks

Note: I have omitted the namespace prefixes and ‘using namespace xxx’ statements in the code below to improve readability. Also, since I didn’t run the code through a compiler, there may be a lot of typos.

Don’t be confused by Rtti vs. RTTI:

Rtti is the class name, MyClass::RTTI is the name of the Rtti-object of a class. Every RefCounted derived class has exactly one static instance of Core::Rtti, which is initialized before main() is entered.

Check whether an object is an instance of a specific, or of a derived class:

This is the standard feature of the Nebula3 RTTI system, checking whether an object can safely be cast to a specific class interface:

// check whether obj is instance of a specific class:
if (obj->IsInstanceOf(MyClass::RTTI)) …

// check whether obj is instance of class, or a derived class:
if (obj->IsA(MyClass::RTTI))…

Compared to Nebula2, N3’s RTTI check is extremely fast (in N2, it was necessary to convert a class name string into a pointer first before doing the check). In N3, RTTI checks are simple pointer comparisons. The IsA() method is a bit slower if the class doesn’t match since it needs to walk the inheritance hierarchy towards the root. Because of this it is always better to use the IsInstanceOf() method if possible, since this is always only a single pointer comparison.
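
To illustrate why the two checks differ in cost, here's a minimal stand-alone sketch. This is not the actual Core::Rtti code: the Rtti, Base and Derived classes below are simplified stand-ins, and only the method names IsInstanceOf(), IsA(), IsDerivedFrom() and GetParent() are taken from the text.

```cpp
#include <cassert>

// Stand-in for Core::Rtti: one static instance per class, linked to its parent.
class Rtti
{
public:
    explicit Rtti(const Rtti* parentRtti) : parent(parentRtti) {}
    const Rtti* GetParent() const { return parent; }
    // Walks the parent chain towards the root; a mismatch costs one
    // pointer comparison per level of the inheritance hierarchy.
    bool IsDerivedFrom(const Rtti& other) const
    {
        for (const Rtti* cur = this; cur != nullptr; cur = cur->parent)
        {
            if (cur == &other) return true;
        }
        return false;
    }
private:
    const Rtti* parent;
};

class Base
{
public:
    static const Rtti RTTI;
    virtual ~Base() = default;
    virtual const Rtti* GetRtti() const { return &RTTI; }
    // exact class match: always a single pointer comparison
    bool IsInstanceOf(const Rtti& rtti) const { return GetRtti() == &rtti; }
    // class or derived class: may need to walk the hierarchy
    bool IsA(const Rtti& rtti) const { return GetRtti()->IsDerivedFrom(rtti); }
};
const Rtti Base::RTTI(nullptr);

class Derived : public Base
{
public:
    static const Rtti RTTI;
    const Rtti* GetRtti() const override { return &RTTI; }
};
const Rtti Derived::RTTI(&Base::RTTI);

// Example: a Derived object seen through a Base reference.
inline void ExampleChecks()
{
    Derived d;
    const Base& obj = d;
    assert(obj.IsInstanceOf(Derived::RTTI));
    assert(!obj.IsInstanceOf(Base::RTTI));
    assert(obj.IsA(Base::RTTI));
}
```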

Both methods also exist as class-name and class-fourcc versions, but of course these are both slower than the methods which directly work with the RTTI objects:

if (obj->IsInstanceOf("MyNamespace::MyClass")) …
if (obj->IsInstanceOf(FourCC('MYCL'))) …

if (obj->IsA("MyNamespace::MyClass")) …
if (obj->IsA(FourCC('MYCL'))) …

Using Ptr<> cast methods for safe-casting:

The Ptr<> class comes with 3 cast methods, 2 for safe up- and down-casts, and one unsafe-but-fast C-style cast. To do a down-cast (from a general parent class down to a specialized sub-class) you can do this:

// assume that res is a Ptr<Resource>, and safely down-cast
// it to a Ptr<D3D9Texture> (D3D9Texture is a subclass of Resource):
const Ptr<D3D9Texture>& d3d9Tex = res.downcast<D3D9Texture>();

This will generate a runtime-error if res is not a D3D9Texture object.

Safely casting upwards in the inheritance hierarchy works as well:

const Ptr<Resource>& res = d3d9Tex.upcast<Resource>();

An unsafe C-style cast is done like this:

const Ptr<Resource>& res = d3d9Tex.cast<Resource>();

An unsafe cast is the fastest (in release mode, the compiler should optimize the method call into nothing), but of course it also makes it extremely easy to shoot yourself in the foot. The 2 safe-cast methods call the Rtti::IsDerivedFrom() method. No temporary Ptr<> object is created since they return a const-ref.
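
The difference between a checked and an unchecked cast can be sketched like this. Again, this is not the real Ptr<> implementation: it works on raw pointers, the Rtti, Resource, Texture and Mesh types and the SafeCast()/UnsafeCast() helpers are made up for the example, and the sketch throws an exception where N3 would raise its runtime error.

```cpp
#include <cassert>
#include <stdexcept>

// Minimal stand-in RTTI: each class has one static node pointing at its parent.
struct Rtti
{
    const Rtti* parent;
    bool IsDerivedFrom(const Rtti& other) const
    {
        for (const Rtti* cur = this; cur != nullptr; cur = cur->parent)
        {
            if (cur == &other) return true;
        }
        return false;
    }
};

struct Resource
{
    static const Rtti RTTI;
    virtual ~Resource() = default;
    virtual const Rtti& GetRtti() const { return RTTI; }
};
const Rtti Resource::RTTI = { nullptr };

struct Texture : Resource
{
    static const Rtti RTTI;
    const Rtti& GetRtti() const override { return RTTI; }
};
const Rtti Texture::RTTI = { &Resource::RTTI };

struct Mesh : Resource
{
    static const Rtti RTTI;
    const Rtti& GetRtti() const override { return RTTI; }
};
const Rtti Mesh::RTTI = { &Resource::RTTI };

// Checked cast: verifies the dynamic type first, then casts.
template<class Target, class Source>
Target* SafeCast(Source* obj)
{
    assert(obj != nullptr);
    if (!obj->GetRtti().IsDerivedFrom(Target::RTTI))
    {
        throw std::runtime_error("invalid cast");
    }
    return static_cast<Target*>(obj);
}

// Unchecked cast: no test at all, the caller must know that the type is right.
template<class Target, class Source>
Target* UnsafeCast(Source* obj)
{
    return static_cast<Target*>(obj);
}
```

Calling SafeCast<Texture>() on a Resource pointer which really points to a Mesh throws, while UnsafeCast<Texture>() would silently hand back a wrongly typed pointer.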

Query RTTI objects directly:

You can directly query many class properties without having an actual object of the class around:

// get the name of a class:
const String& className = MyClass::RTTI.GetName();

// get the FourCC identifier of a class:
FourCC classFourCC = MyClass::RTTI.GetFourCC();

// get a pointer to the Rtti object of the parent class
// (returns 0 when called on RefCounted::RTTI)
Rtti* parentRtti = MyClass::RTTI.GetParent();

// check if a class is derived from this class:
// by Rtti object:
if (MyClass::RTTI.IsDerivedFrom(OtherClass::RTTI)) …
// by class name:
if (MyClass::RTTI.IsDerivedFrom("MyNamespace::OtherClass")) …
// by class fourcc:
if (MyClass::RTTI.IsDerivedFrom(FourCC('OTHR'))) …

You can check two Rtti objects for equality or inequality:

const Rtti& otherRtti = …;
if (MyClass::RTTI == otherRtti)…
if (MyClass::RTTI != otherRtti)…

Since it is guaranteed that only one Rtti object exists per class, this is equivalent to comparing the addresses of 2 Rtti objects (and that’s in fact what the equality and inequality operators do internally).

Create objects directly through the RTTI object:

Ptr<MyClass> myObj = (MyClass*) MyClass::RTTI.Create();

The old-school C-style cast looks a bit out of place but is currently necessary because the Rtti::Create() method returns a raw pointer, not a smart-pointer.

Creating an object through the RTTI object instead of the static MyClass::Create() method is useful if you want to hand the type of an object as an argument to a method call like this:

Ptr<RefCounted> CreateObjectOfAnyClass(const Rtti& rtti)
{
    return rtti.Create();
}

This is a lot faster than the 2 other alternatives, creating the object through its class name or class fourcc identifier.

Create objects by class name or FourCC identifier

You can use the Core::Factory singleton to create RefCounted-derived objects by class name or by a FourCC identifier:

Ptr<MyClass> obj = (MyClass*) Factory::Instance()->Create("MyNamespace::MyClass");

Ptr<MyClass> obj = (MyClass*) Factory::Instance()->Create(FourCC('MYCL'));

This is mainly useful for serialization code, or if the type of an object must be communicated over a network connection.

Lookup the RTTI object of a class through the Core::Factory singleton

You can get a pointer to the static RTTI object of a class by class name or class FourCC identifier:

const Rtti* rtti = Factory::Instance()->GetClassRtti("MyNamespace::MyClass");

const Rtti* rtti = Factory::Instance()->GetClassRtti(FourCC('MYCL'));

This will fail hard if the class doesn’t exist. You can check whether a class has been registered with the factory using the ClassExists() methods:

bool classExists = Factory::Instance()->ClassExists("MyNamespace::MyClass");

bool classExists = Factory::Instance()->ClassExists(FourCC('MYCL'));
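
The create-by-name, create-by-FourCC and ClassExists() lookups above boil down to two lookup tables. Here's a small stand-alone sketch of the idea. It is not the real Core::Factory: Object, MakeFourCC(), Register() and RegisterMyClass() are hypothetical names for this example, and it uses std::unique_ptr instead of Ptr<> to stay self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class standing in for Core::RefCounted.
struct Object
{
    virtual ~Object() = default;
    virtual std::string ClassName() const = 0;
};

// Packs four characters into one 32-bit code, similar in spirit to a FourCC.
constexpr std::uint32_t MakeFourCC(char a, char b, char c, char d)
{
    return (std::uint32_t(static_cast<unsigned char>(a)) << 24) |
           (std::uint32_t(static_cast<unsigned char>(b)) << 16) |
           (std::uint32_t(static_cast<unsigned char>(c)) << 8) |
            std::uint32_t(static_cast<unsigned char>(d));
}

// Stand-in factory: maps class names and FourCC codes to creator functions.
class Factory
{
public:
    using Creator = std::function<std::unique_ptr<Object>()>;

    static Factory& Instance()
    {
        static Factory factory;
        return factory;
    }
    void Register(const std::string& name, std::uint32_t fourcc, Creator creator)
    {
        // this sketch treats a FourCC collision as a hard error
        assert(byFourCC.count(fourcc) == 0);
        byName[name] = creator;
        byFourCC[fourcc] = creator;
    }
    bool ClassExists(const std::string& name) const { return byName.count(name) != 0; }
    bool ClassExists(std::uint32_t fourcc) const { return byFourCC.count(fourcc) != 0; }
    // both Create() overloads throw std::out_of_range for unknown classes
    std::unique_ptr<Object> Create(const std::string& name) const { return byName.at(name)(); }
    std::unique_ptr<Object> Create(std::uint32_t fourcc) const { return byFourCC.at(fourcc)(); }
private:
    std::map<std::string, Creator> byName;
    std::map<std::uint32_t, Creator> byFourCC;
};

struct MyClass : Object
{
    std::string ClassName() const override { return "MyNamespace::MyClass"; }
};

// Registers MyClass under its name and FourCC; call this once at startup.
inline void RegisterMyClass()
{
    Factory::Instance().Register("MyNamespace::MyClass", MakeFourCC('M', 'Y', 'C', 'L'),
        [] { return std::unique_ptr<Object>(new MyClass); });
}
```

The string lookup has to compare characters while the FourCC lookup compares plain integers, and both need a table search, which is why going through the Rtti object directly is the cheapest of the three options.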

Troubleshooting

There are 2 common problems with Nebula3’s RTTI system.

When writing a new class, it may happen that the FourCC code of the class is already taken. In this case, an error dialog will pop up at application start.

To fix this collision, change the FourCC code of one of the affected classes and recompile.

The other problem is that a class doesn’t register at application startup because the constructor of its static RTTI object has been “optimized away” by the linker. This happens when there’s no actual C++ code in the application which directly uses this class. This is the case if an object is created through N3’s create-by-class-name or create-by-class-fourcc mechanism and the class is only accessed indirectly through virtual method calls.

In this case the linker will drop the .obj module of this class completely since there are no calls from the outside into the object module. That’s a neat optimization to keep the executable size small, and it works great with the static object model of plain C++, but with Nebula3’s dynamic object model we need to trick the linker into linking “unused” classes into the executable. Fortunately, we don’t have to do this for every RefCounted-derived class, only for specific parts of the inheritance hierarchy (for instance subclasses of ModelNode and ModelNodeInstance in the Render layer, or subclasses of Property in the Application layer).

To prevent the linker from dropping a class the following procedure is recommended:

add a __RegisterClass(MyClass) macro to a central .h ‘class registry’ header file
include this header file into a .cc file which definitely won’t be dropped by the linker

The header file /nebula3/code/render/render_classregistry.h is an example for such a centralized class registry header.

6 Jun 2009

Tidbits

I must admit that I’m completely hyped for the Xbox team’s new “interaction system” Project Natal, especially since this guy has been part of the team (Microsoft sure knows how to grab the relevant people). One has to look through the silly demos though (using a car racing game to demonstrate “controller free gaming” is kind of stupid when in the real world you’re sitting *inside* a huge freaking controller, which has a much more complex interface than any input device used for gaming and requires extremely skillful use of both hands and feet…).

It’s fascinating to think about how much more information a game suddenly has about the world on the other side of the wall. With mouse/keyboard, a game pad, or even a Wiimote, the button presses, pointer positions, analog sliders and acceleration vectors only provide a handful of values, a few hundred bits of information maybe. That’s like a few tiny lights in a vast dark ocean. But if the currently available information is somewhat correct, Natal provides per-pixel color and depth-information as well as a microphone array for spotting sound sources, everything pre-processed in meaningful ways by the software (that’s the important part which differentiates it from something simple like a web cam). That’s like a sunrise in the dark ocean! The game is suddenly able to see and hear what’s happening in that strange parallel universe called “real world”, and the software (which is hopefully provided with the 360 SDK) is the brain which interprets that information. That’s another great thing: all the interesting stuff happens in the software and can be improved with a simple software update, instead of trying to sell another hardware addon like the Motion Plus to baffled players.

Now the only question is how to prevent walking into the Wii trap and not flood the 360 with stupid mini-games (IMHO mini-games are to game designers what the crates are for level designers: their last resort if they run out of ideas). It would be a terrible shame if this great technology is only used for Wii Play clones and silly “family games”. If Natal is rejected by the 360’s native hardcore crowd (yes I said the dirty word, there is no such thing as a “casual Xbox player”), it will soon fade into oblivion just as the current Vision Camera. I really doubt that Natal will cause casuals to forget about their Wii and convert to the 360 en masse. That ship has sailed long ago, maybe it could happen with the next console generation, or if Microsoft does a really big re-launch of a theoretical “Natal 360” which needs to look a lot slicker and sexier than the current 360. Anyway, for the games audience, I think Microsoft should first try to sell Natal to the hardcore, and that’s not done with some silly ball catching or tennis demos, but with full-frontal science fiction shit like this: http://www.youtube.com/watch?v=UUuRqzVhJmA (although this is a conventional multi-touch device, not Natal, but you get the idea). For marketing to the casuals, concentrate on the 360 as a media device (now that was the right step in one of the demo videos, demonstrate how to watch a movie without having to mess around with remote controls, because the one thing that’s much more complicated than a game pad is a remote control with dozens of tiny buttons).

I think the best way to design a Natal game is to pretend that gamepads, keyboard and pointing devices never existed, go back to the 60’s or even the 50’s and create some sort of alternate future without keyboards, mice and gamepads. This is hard and will take a lot of time, since our current gaming genres are the result of decades of joint evolution of game play and input methods. Simply putting a new input scheme over existing games won’t work very well (I think the Wii has shown this already). But on the other hand, the first thing that you’re telling a PC game designer or programmer is to forget that mouse/keyboard ever existed when he’s starting to work on a console game and fully embrace the platform’s controller as if it were the only input device in the whole universe. Switching to this new state of mind can be incredibly hard and some will never be able to do it, but with an open mind this often works surprisingly well.

Some unrelated stuff:

Man, I hope 70 Euros isn’t becoming the new standard price for Xbox360 games in Germany (that’s about 100 US-$!). 60 Euros was already a shocker when the 360 launched. 70 Euros is waaayyy beyond impulse-buy-territory for me, especially when there’s only a few hours of entertainment in a game, which is becoming more and more common. I’m currently extremely selective when buying new games, I think I only bought one or two new games this year, instead of almost one game per month in the past. I can’t imagine that this pricing strategy is good for the 360-platform in Germany, since people will only buy the very few big blockbusters and skip everything else, or get the cheaper (and most often better) UK versions.
Very cool news: Bethesda will offer the Fallout 3 DLC on discs! I have skipped all but the first addons for Fallout 3 because they’re only available in the German language version. With the disc release I can finally order the proper English version and don’t have to live with the poor German localization.
My wish from last year has been granted! Full game downloads on Xbox Live starting August 2009! Maybe I should consider a career as a fortune-teller or analyst hehe…
We got a couple of those slick new PS3 devkits (the new ones in standard PS3 cases, not those older 19” rack monsters) and have started the Nebula3 PS3 port already. We’re putting a proper team of 4 behind this so the circle will soon be complete :) I’m quite happy with the progress so far. The SDK is very complete, the tools are nice enough (pretty good VStudio integration, thank God). Not quite as cool as the 360 stuff, but that’s more a question of personal taste and probably not objective. The PS3 SDK seems to have made great strides since the PS3 launch, and I think much of the negativity on the internet surrounding the PS3 development stems from those early days (which we haven’t experienced ourselves).

That’s it for now, I’m going to nose around the PS3 SDK docs a bit now, perfect activity for a Saturday afternoon :)

-Floh.