30 Dec 2008

Games Of The Year

My personal GOTYs for 2008:
  • Fallout 3: I was very skeptical at first, but after 1.5 playthroughs (and counting) I must say that this is the best Bethesda game yet. In my opinion, the Fallout world fits the "Bethesda school of game design" much better than the fantasy world of the Elder Scrolls. And gameplay-wise, Fallout is quite a bit more streamlined than Oblivion, keeping the player engaged without giving up too much freedom.
  • Ninja Gaiden 2: Yeah, the game was rushed and there are some obvious flaws, but the combat in NG2 is simply spectacular. I must have played through the game eight or nine times, and I'm currently in the middle of my first Mentor run. I wish there were hope for some sort of refined "NG2 Black", but with Itagaki's departure from Tecmo the chances seem to be nil :(
  • Dead Space: This game came out of nowhere for me, and in fact, it's hard to describe why the game is so great. Dead Space is basically "Doom3, done right", and that's all there needs to be said. It's an old-school corridor shooter with shiny graphics and rock-solid shooting mechanics. And I can't get over the fact that this is an EA game. EA!!!
Live Arcade GOTYs: Castle Crashers, RezHD, BC: Rearmed and Omega Five (hmm, more great games on XBLA than at retail this year...).

Here's my XBLA wishlist for 2009: downloadable full games, and more price variety for downloadable games (in my opinion, a game like Mirror's Edge would be better off if it didn't have to fill the shoes of a full-blown 60.- Euro game).

See ya'all in 2009 :)

12 Dec 2008

Home (PSN)

I was just checking out Home, and to put it mildly, I was not impressed:
  • I was trying for about 15 minutes to get past the start screen, getting all types of connection errors, then suddenly... one connection attempt worked. As a "casual user" I would have given up after the first four or five errors (which have such intuitive names as "C931").
  • I tried for 10 minutes to create a character which doesn't look like shit and failed.
  • Downloads and long loading times between sections totally kill the "experience".
  • Everything basically looks and feels like it has been designed by yuppie marketeers stuck in the 90's.
I doubt I will ever return. Even if Home was "the cool shit", the long startup time kills every impulse to even launch that thing.

I don't know... I refuse to believe that Sony is driven by marketing peeps who thought that Second-Life is the next-big-thing during its short media-hype. I just don't "get" it. But who knows, I wasn't "getting" the Wii either, and look how that turned out (yeah, who am I kidding, it's more likely that Home will turn into a wasteland and Sony will let it die a quiet and slow death in a few months). Hmm... wasteland... time to return to Fallout3 :)

22 Nov 2008

Month of Shooters

I'm currently playing through the Gears 2 single player campaign, and I don't know why exactly, but I'm not having very much fun. There are a few easy-to-identify points which I clearly don't like:
  • Vehicle sequences: I don't know why FPS designers still torture us with vehicle sequences. The mini-tank in Gears 2 controls like shit (a bit like the Mass Effect Mako, but worse), and there's too much driving through boring tunnels with nothing else happening.
  • Stupid story: So far I'm not seeing the better story that was promised. Instead I'm just seeing Dom bringing up his wife all the time (just found her yesterday). What is this emotional bullshit doing in my Gears... really. If you want to do an emotional story, do it right. The Gears main characters are simply not built for stuff like this. It's like asking Schwarzenegger to play Shakespeare. Some things simply don't work, no matter how hard one tries.
  • Dialog: alright, Marcus isn't exactly a second Cicero when it comes to conversational skills, but those stupid one-liners get old pretty fast.
  • Too much brown: I know it's cliche, but Gears could just as well be played on an old black-and-white monitor and not a lot of information would be lost. I'm sick of monochromatic shooters.
Apart from the vehicle sequences (which are inexcusable) I can live with the other points. I like my bad stories and cheesy dialogs, if there's a good game at the core.

But somehow, the core gameplay isn't very satisfying to me in Gears2. I've been playing the COD4 single-player campaign again for comparison, and holy shit, this is an entirely different level; even the "quiet moments" in COD4 are packed with action.

Gears2 brings up unbelievable stuff like mile-long worms, giant fish and whole cities sinking into the ground, but it doesn't grab me. Maybe it is because the scale is too big, or maybe it is because the actual fire fights aren't very exciting. I think the main reason why the game leaves me cold is that the pauses between fire fights are too long, and the fights are too predictable. Coming into an area with a lot of conveniently placed barricades? Sure as hell, a few seconds later a door will shut behind you and Locust will start attacking.

Good thing Dead Space came along, otherwise I might have lost my faith in shooters ;)

I'm very impressed with the new Xbox dashboard, especially with the fast and painless update process (took maybe 2 minutes through my 16Mbit DSL line at home). I was prepared for the worst (something like 2 weeks without Xbox Live like last Christmas), but apart from a few glitches on the Marketplace which were fixed the next day, everything went perfectly. I love how responsive, fast and colorful everything is now, and the hard drive installation is a godsend because it turns off the jet-engine noise from the DVD drive.

I even like the Avatar stuff more than I should. I've been trying to create a Ron Jeremy avatar, but that's where the avatar creator really reaches its limits. Here's my wishlist for the next update:
  • the current fat setting needs to be a lot fatter, the current maximum isn't even enough to build a realistic Elvis in his later years
  • more 70's porn star accessories and clothing please
  • more real hair styles(!), there's plenty of 90's neo-hippie shit, but no hair which nearly does Ron Jeremy justice
  • hair styles should include chest and armpit options
Since I didn't find the right hair and clothes, I went for a Ron Jeremy/Princess Leia hybrid:

8 Nov 2008

CoreGamer

It feels like this is the most packed holiday season of all time. It is a shame that publishers don't spread their releases a bit more over the year. Hardly a single good game came out for the 360 each quarter, and now suddenly, since the end of October, it feels like there's a blockbuster released every day. Unfortunately, a lot of great games will be buried under heavyweights like Gears or Fallout3, but well, who am I to complain. It's a great time for hardcore gamers across all platforms, and that's all that counts :)

I think I'll have to work on the oncoming backlog until at least February next year, and careful scheduling is indeed necessary to manage the avalanche of games. Fortunately some of the games didn't turn out quite as good as the pre-release hype made me believe, so I can push a few games into next year. One of those is FarCry2. It's not a bad game by any means; I was prepared for the open-world-ness (which is the exact opposite of the original FarCry) and the shitty to non-existent story (which is exactly in line with the original FarCry). But still I was disappointed. I must confess that I didn't give the game a real chance (only played for one evening, about 2 to 3 hours). Graphics are great, but gameplay-wise it is somewhere between Just Cause and Mercenaries2, and I've already had my share of sandbox games this year I guess. However I will definitely come back to FC2 next year when the storm has settled a bit, but it looks like this will be the Assassin's Creed of 2008 (spectacular presentation, shallow gameplay).

Next up was Fable2. Great game (especially with the downloadable English audio track), but I just don't have the time to appreciate the game as a whole. The game requires a lot of time investment, but rewards the player for the time spent with a great sense of immersion. At the moment I just don't have the attention span required for a game like this. One thing I found surprising was that one of my old favorites, Overlord, in places looks better than or at least very similar to Fable2 (not a surprise, since the Overlord designers definitely drew a lot of inspiration from the original Fable), but considering that Overlord is a 2007 title, Fable2 should have been a prettier game. I was about half way through Fable2 when along came my personal surprise hit of 2008 (so far at least):

Dead Space!

What a f*cking great game. It's a bit of Alien (the movie), a bit of System Shock, and a bit of Quake2, merged into a wonderfully old-school survival-horror-corridor-shooter. The really surprising bit is: this is an EA game! With the exception of Fight Night 3 and Skate, I wasn't interested in a single EA game for the entire life-time of the 360, and now all of a sudden EA actually starts producing great games in a row. This year alone I bought Battlefield BC, Mercenaries2 and Dead Space from EA, and will probably get Mirror's Edge soon. And considering that Bioware is now EA as well... oh dear. What has the world come to, EA making great games... The end must be near indeed hehe.

The PS2 is now at 77 Euro in Germany, just a tad more expensive than a typical 360 or PS3 game. That's the latest model: slim, with integrated power supply, one DualShock, composite cables, looking sexy as hell (no memory card though). Since my last PS2 went MIA, this was a very good reason to impulse-buy a new one. A better investment than my PS3 to be honest. Just playing a few minutes into MGS3 again was worth it.

CoreAnimation

Rewriting the animation system of Nebula2 was one of the big items on my personal to-do list for the last couple of months. Too many small features had been stacked on top of each other over time by too many programmers without the underlying old code really supporting the new stuff. The result was that the N2 animation system became a fragile and difficult-to-work-with mess. It mostly does what it needs to do for Drakensang (which isn't trivial with its finely tuned animation queueing during combat), but it isn't elegant any more, and it is not really a joy to work with.

I had only been working very sporadically on the new N3 animation code during the past few months, restarting from scratch several times when I felt I was heading in the wrong direction. In the past couple of weeks I could finally work nearly full-time on the new animation and character subsystems, and not too early, because only then was I really satisfied with the overall design. The new animation system should fix all the little problems we encountered during the development of Drakensang and offer a few new features which we wished we had had earlier. And it resides under a much cleaner and more intuitive interface than before (finding class interfaces which encapsulate the new functionality but are still simple to work with was the main reason why I started over several times).

One of the earliest design decisions was to split the animation code into two separate subsystems. CoreAnimation is the low level system which offers high-performance, simple building blocks for a more complex higher level animation system. The high-level Animation subsystem sits on top of CoreAnimation and provides services like mapping abstract animation names to actual clip names, and an animation sequencer which allows easy control over complex animation blending scenarios.

The main focus of CoreAnimation is high performance for basic operations like sampling and mixing of animation data. CoreAnimation may contain platform-specific optimizations (although none of these have been implemented so far; everything in CoreAnimation currently works with the Nebula3 math library classes). CoreAnimation also assumes that motion capturing is the primary source of animation data. Like sampled audio vs. MIDI, motion capture data consists of a large number of animation keys placed at even intervals instead of a few manually placed keys at arbitrary positions on the time-line. The advantage is that working with this kind of animation data can be reduced to a few very simple stream operations, which are well suited for SSE, GPUs or Cell SPUs. The disadvantage compared to a spline-based animation system is of course: a lot more data.
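
To make the "simple stream operations" point concrete, here's a minimal sketch of sampling a single curve of uniformly spaced keys (standalone C++, not actual Nebula3 code; the float4 struct, the keyStride parameter and the clamping behaviour are my own assumptions):

// Sampling a curve whose keys are spaced at a fixed keyDuration: no key
// search is needed, just an index computation and one linear interpolation.
struct float4 { float x, y, z, w; };

static float4 Lerp(const float4& a, const float4& b, float l) {
    return { a.x + l * (b.x - a.x), a.y + l * (b.y - a.y),
             a.z + l * (b.z - a.z), a.w + l * (b.w - a.w) };
}

// keys:        first key of this curve inside the big key buffer
// keyStride:   distance in float4s between two consecutive keys of this curve
//              (keys of different curves are interleaved for cache friendliness)
// numKeys:     number of keys in the curve
// keyDuration: time between two keys (in seconds)
// time:        sample time, assumed to be >= 0
static float4 SampleCurve(const float4* keys, int keyStride, int numKeys,
                          float keyDuration, float time) {
    float keyPos = time / keyDuration;           // position in key space
    int   i0     = (int)keyPos;                  // left key index
    float lerp   = keyPos - (float)i0;           // in-between factor 0..1
    if (i0 >= numKeys - 1) {                     // clamp at the end of the clip
        return keys[(numKeys - 1) * keyStride];
    }
    return Lerp(keys[i0 * keyStride], keys[(i0 + 1) * keyStride], lerp);
}

Since many curves share the same index math, whole batches of curves can be sampled in one tight loop, which is exactly what vector hardware likes.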

Spline animation will be used in other parts of Nebula3, but support for this will likely go into a few simple Math lib classes, and not into its own subsystem.

Although not limited to them, CoreAnimation assumes that skinned characters are the primary animation targets. This does not in any way limit the use of CoreAnimation for animating other types of target objects, but the overall design and the optimizations "favour" the structure and size of animation data of a typical character (i.e. hundreds of clips, hundreds of animation curves per clip, and a few hundred to a few thousand animation keys per clip).

Another design limitation was that the new animation system needs to work with existing data. The animation export code in the Maya plugin, and the further optimization and bundling during batch processing, aren't exactly trivial, and although much of the code on the tools side would benefit from a cleanup as well (as is usually the case for most tools code in a production environment), I didn't feel like rewriting this stuff too, especially since there's much more work on the tools side of the animation system than on the runtime side.

So without further ado I present: the classes of CoreAnimation :)

  • AnimResource: The AnimResource class holds all the animation data which belongs to one target object (for instance, all the animation data for a character), that is, an array of AnimClip objects, and an AnimKeyBuffer. AnimResources are normal Nebula3 resource objects, and thus can be shared by ResourceId and can be loaded asynchronously.
  • StreamAnimationLoader: The StreamAnimationLoader is a standard stream loader subclass which initializes an AnimResource object from a data stream containing the animation data. Currently, only Nebula2 binary .nax files are accepted.
  • AnimKeyBuffer: This is where all the animation keys live for a single AnimResource. It is just a single memory block of float4 keys; the key buffer itself contains no information about how the keys relate to animation clips and curves. However, the animation exporter tools make sure that keys are arranged in a cache-friendly manner (keys are interleaved in memory, so that the keys required for a single sampling operation are close to each other in memory).
  • AnimClip: An AnimClip groups a set of AnimCurves under a common name (i.e. "walk", "run", "idle", etc...). Clip names are usually the lowest level component a Nebula3 application needs to care about when working with the animation subsystem. Clips have a number of properties and restrictions:
    • a human-readable name, this is stored and handed around as a StringAtom, so no copying of actual string data happens
    • a clip contains a number of AnimCurves (for instance, a typical character animation clip has 3 curves per skeleton-joint: one each for translation, rotation and scaling of the joint)
    • all anim curves in a clip must have the same key duration and number of keys
    • a clip has a duration (keyDuration * numKeys)
    • a pre-infinity-type and post-infinity-type defines how a clip is sampled when the sample time is outside of the clip's time range (clamp or cycle).
  • AnimCurve: An AnimCurve groups all the keys which describe the change of a 4D-value over a range of time. For instance, the animated translation of a single joint of a character skeleton in one clip is described by one animation curve in the clip. AnimCurves don't actually hold the animation keys, instead they just describe where the keys are located in the AnimKeyBuffer of the parent AnimResource. AnimCurves have the following properties:
    • Active/Inactive: an inactive AnimCurve is a curve which doesn't contribute to the final result, for instance, if an AnimClip only animates a part of a character skeleton (like the upper body), some curves in the clip are set to inactive. Inactive curves don't have any keys in the key buffer.
    • Static/Dynamic: an AnimCurve whose value doesn't change over time is marked as static by the exporter tool, and doesn't take up any space in the anim key buffer.
    • CurveType: this is a hint for the higher level animation code what type of data is contained in the animation curve, for instance, if an AnimCurve describes a rotation, the keys must be interpreted as quaternions, and sampling and mixing must use spherical operations.
  • Animation keys: There isn't any "AnimKey" class in the CoreAnimation system, instead, the atomic key data type is float4, which may be interpreted as a point, vector, quaternion or color in the higher level parts of the animation system. There is no support for scalar keys, since most animated data in a 3d engine is vector data, and vector processing hardware likes its data in 128 bit chunks anyway.
  • AnimEvent: Animation events are triggered when the "play cursor" passes over them. The same concept was called "HotSpots" in Nebula2. AnimEvents haven't actually been implemented yet, but they are nevertheless essential for synchronizing all types of stuff with an animation. For instance, a walking animation should trigger events when a foot touches the ground, so that footstep sounds and dust particles can be created at the right time and position. Events are also useful for synchronizing the start of a new animation with a currently playing animation (for instance, start the "turn left" animation clip when the current animation has the left foot on the ground, etc...).
  • AnimSampler: The AnimSampler class only has one static method called Sample(). It samples the animation data from a single AnimClip at a specific sampling time into a target AnimSampleBuffer. This is one of the 2 "front-end-features" provided by the CoreAnimation system (sampling and mixing). The AnimSampler is used by the higher level Animation subsystem.
  • AnimMixer: Like the AnimSampler class, the AnimMixer class only provides one simple static Method called Mix(). The method takes 2 AnimSampleBuffers and a lerp value (usually between 0 and 1) and mixes the samples from the 2 input buffers into the output buffer (k = k0 + l * (k1 - k0)). The AnimMixer is used for priority-blending of animation clips higher up in the Animation subsystem.
  • AnimSampleBuffer: The AnimSampleBuffer holds the resulting samples of the AnimSampler::Sample() method, and is used as input and output for the AnimMixer::Mix() method. An important difference from the AnimKeyBuffer is that the AnimSampleBuffer also has a separate "SampleCounts" array. This array keeps track of the number of sampling operations which have accumulated for every sample while sampling and mixing animation clips into a final result, which is necessary for mixing partial clips correctly (clips which only influence a part of a character skeleton). The AnimSampler::Sample() method sets the sample count to 1 for each key which was sampled from an active animation curve, and to 0 for each inactive curve (which means the actual sample value is invalid). Later, when mixing 2 sample buffers, the AnimMixer::Mix() method looks at the input sample counts and only performs a mixing operation if both input samples are valid. If one input sample is invalid, no mixing takes place; instead the other (valid) sample is written directly to the result. If both input samples are invalid, the output sample is invalid as well. Finally, the AnimMixer::Mix() method sets the output sample counts to the sum of the input sample counts, thus propagating the previous sample counts to the next mixing operation. Thus, if at the end of a complex sampling and mixing operation the sample count of a specific sample is zero, this means that not a single animation clip contributed to that sample (which should probably be considered a bug). A small sketch of this mixing rule follows after this list.
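
To tie the last three classes together, here's a minimal sketch of the mixing rule with sample counts as described above (standalone C++ using std::vector, not the actual Nebula3 classes; quaternion curves would additionally need spherical interpolation instead of the plain lerp shown here):

#include <vector>

struct float4 { float x, y, z, w; };

struct AnimSampleBuffer {
    std::vector<float4>        samples;
    std::vector<unsigned char> sampleCounts;   // 0 means "sample is invalid"
};

// out = in0 mixed with in1 by lerp, with sample-count bookkeeping
void Mix(const AnimSampleBuffer& in0, const AnimSampleBuffer& in1,
         float lerp, AnimSampleBuffer& out) {
    const size_t num = in0.samples.size();
    out.samples.resize(num);
    out.sampleCounts.resize(num);
    for (size_t i = 0; i < num; i++) {
        bool valid0 = (in0.sampleCounts[i] > 0);
        bool valid1 = (in1.sampleCounts[i] > 0);
        if (valid0 && valid1) {
            // both samples are valid: k = k0 + l * (k1 - k0)
            const float4& k0 = in0.samples[i];
            const float4& k1 = in1.samples[i];
            out.samples[i] = { k0.x + lerp * (k1.x - k0.x),
                               k0.y + lerp * (k1.y - k0.y),
                               k0.z + lerp * (k1.z - k0.z),
                               k0.w + lerp * (k1.w - k0.w) };
        } else if (valid0) {
            out.samples[i] = in0.samples[i];    // only the first input is valid
        } else if (valid1) {
            out.samples[i] = in1.samples[i];    // only the second input is valid
        }
        // if both inputs are invalid the output sample stays invalid;
        // either way, propagate the counts to the next mixing stage
        out.sampleCounts[i] = in0.sampleCounts[i] + in1.sampleCounts[i];
    }
}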

That's it so far for the CoreAnimation subsystem, next up is the Animation subsystem which builds on top of CoreAnimation, and after that the new Character subsystem will be described, which in turn is built on top of the Animation system.

1 Oct 2008

Must Read

2 important and related blog posts:

Shader Workflow - Why Shader Generators are Bad
Graphical shader systems are bad

Good arguments why shaders should be treated as code under programmer control, not as graphics assets under artist control.

30 Sept 2008

Line-counting

Some line-count trivia:

Nebula3 Foundation Layer: 66,502 lines
Nebula3 Render Layer: 73,465 lines
Nebula3 Application Layer: 22,706 lines
Nebula3 Addons: 32,847 lines
Nebula3 All: 195,520 lines
Nebula2: 239,279 lines
Mangalore: 181,592 lines
N2 + Mangalore: 420,871 lines

The N3 line-count includes the current code for 3 platforms (Win32, Xbox360 and Wii), while the N2+Mangalore count only includes one platform (Win32). Looks like N3 will end up a lot leaner than the old code, which is a good thing :)

29 Sept 2008

Nebula3 September SDK

Here's the new SDK:

N3SDK_Sep2008.exe.

Please check my previous post for details, which got f*cked up pretty badly by importing it from Google Docs :(

27 Sept 2008

What's New in the September Nebula3 SDK


I finally got around to putting together a new N3 SDK. I'll upload it on Monday when I'm back in the office; in the meantime here's a rough What's New list. A lot of under-the-hood stuff has changed, and I had to remove a few of the fancy front-end features for now (for instance, the N2 character rendering had to be removed when I implemented the multi-threaded renderer, and the shader lighting code is broken at the moment). I'll take care of this front-end stuff in the next release.

General Stuff


  • changes to enable mixing Nebula2 and Nebula3 code, mainly macro names are affected (DeclareClass -> __DeclareClass, ImplementSingleton -> __ImplementSingleton etc...)
  • started to remove #ifndef/#define/#endif include guards since pretty much all relevant compilers (VStudio, GCC, Codewarrior) support #pragma once 
  • moved identical Win32 and Xbox360 source code into a common Win360 namespace to eliminate code redundancies
  • added a new Toolkit Layer which contains helper classes and tools for asset export
  • added and fixed some Doxygen pages

Build System

  • re-organized VStudio solution structure, keeps all dependent projects in the same solution, so it's no longer necessary to have several VStudios open at the same time
  • it's possible now to import VStudio projects through the .epk build scripts (useful for actual Nebula3 projects which do not live under the Nebula3 SDK directory)
  • new "projectinfo.xml" file which defines project- and platform-specific attributes for the asset batch-export tools
  • split the export.zip archive into one platform-neutral and several platform-specific archives (export.zip contains all platform-independent files, export_win32.zip, export_xbox360.zip, export_wii.zip contain the platform-specific stuff)
  • added general multiplatform-support to the asset-pipeline (e.g. "msbuild /p:Platform=xbox360" to build Xbox360-assets)
  • new command-line build tools (with source):
    • audiobatcher3.exe (wraps audio export)
    • texturebatcher3.exe (wraps texture export)
    • shaderbatcher3.exe (wraps shader compilation)
    • buildresdict.exe (generates resource dictionary files)
    • these tools mostly just call other build tools (like xactbld3.exe, nvdxt.exe, or build tools for game-console SDKs)
  • note that the public N3-SDK only contains Win32 support for obvious legal reasons 

Foundation Layer

  • fixed thread-safety bugs in Core::RefCounted and Util::Proxy refcounting code
  • added WeakPtr<> class for better handling of cyclic references
  • added type-cast methods to Ptr<>
  • simplified the System::ByteOrder class interface
  • added platform-specific task-oriented "virtual CPU core IDs" (e.g. MainThreadCore, RenderThreadCore, etc...)
  • added a System::SystemInfo class
  • added Threading::ThreadId type and static Threading::Thread::GetMyThreadId() method
  • proper thread names are now visible in the VStudio debugger and other debugging tools
  • SetThreadIdealProcessor() is now used to assign threads to available CPU cores on the Win32 platform
  • new HTTP debug page for the Threading subsystem (currently only lists the active Nebula3 threads)
  • MiniDump support: crashes, n_assert() and n_error() now write MiniDump files on the Win32 platform
  • new Debug subsystem for code profiling:
    • offers DebugTimer and DebugCounter objects
    • HTTP debug page allows inspecting DebugTimers and DebugCounters at runtime
  • new Memory::MemoryPool class for allocation of same-size memory blocks (speeds up allocation and reduces heap fragmentation)
  • some new and renamed methods in Math::matrix44
  • Http subsystem now runs in its own thread
  • added SVG support to Http subsystem (Http::SvgPageWriter and Http::SvgLineChartWriter)
  • added IO::ExcelXMLReader stream reader class, which allows reading XML-formatted MS Excel spreadsheet files
  • added Behaviour mode to Messaging::AsyncPort, defining how the handler thread should wait for new messages:
    • WaitForMessage: block until message arrives
    • WaitForMessageOrTimeOut: block until message arrives or time-out is reached
    • DoNotWait: do not wait for messages
  • added Remote subsystem, allows remote-controlling N3 applications through a TCP/IP connection

Render Layer

  • moved rendering into its own thread (InternalGraphics subsystem on the render-thread side, and Graphics front-end subsystem on the main-thread side)
  • added CoreAnimation and Animation subsystems (under construction)
  • added UI subsystem for simple user interfaces (under construction)
  • added CoreAudio and Audio subsystems (under construction):
    • CoreAudio is the back-end and runs in its own thread
    • Audio is the "client-side" front-end in the main-thread (or any other thread)
    • designed around XACT concepts
    • comes with XACT wrapper implementation
  • added CoreGraphics::TextRenderer and CoreGraphics::ShapeRenderer classes, both intended for rendering debug visualizations
  • added debug rendering subsystem (currently under the Debug namespace)
  • Frame subsystem: FramePostEffects may now contain FrameBatches
  • Input subsystem: disconnected XInput game-pad slots now only check every 0.5 seconds for connected game-pads
  • Resources subsystem: added ResourceAllocator/ResourceLump system to prepare for true resource streaming on console-platforms

Application Layer and Addons

  • removed CoreFeature (this stuff had to go into the GameApplication class to prevent some chicken-and-egg problems)
  • added NetworkFeature (under construction)
  • added UIFeature (under construction)
  • new CoreNetwork and Multiplayer addon wrapper subsystems for RakNet

Please note the special RakNet licensing conditions. Basically, RakNet is not free if used for a commercial project (http://www.jenkinssoftware.com/). Licensing details for 3rd party libs can be found on the Nebula3 documentation main page.

Stuff I want to do soon

  • fix the shader lighting code
  • add more shaders to bring the shader-lib up-to-par with N2
  • finish the CoreAnimation and Animation subsystems
  • design and implement proper skinned character rendering subsystem
  • add missing functionality to Audio subsystems (for instance sound categories) 
  • make shaders SAS compatible so they work with tools like FXComposer
  • implement a proper resource-streaming system on the 360 (as a proof of concept)
  • optimize messaging (use delegate-mechanism for dispatching, optimize message object creation, add double buffering behaviour to AsyncPort for less thread-synchronization overhead)

20 Sept 2008

Adding functionality to threaded subsystems

Moving subsystems into their own thread introduces restrictions on how other threads can interact with the subsystem. It is no longer possible to simply invoke methods on objects running in the context of a threaded subsystem. The only way to interact with the subsystem is by sending messages to it. From a system design point of view this is a good thing. There's a very clear demarcation line defined by the message protocol for interacting with the subsystem. It is pretty much impossible to invoke undocumented functionality from the outside, and it is hard to "accidentally" use the subsystem's functionality in a way not intended by the subsystem's designer.

But of course those restrictions also have their dark side. Tasks which either require a lot of communication or require exact synchronization had better not be spread across threads. Although the messaging system is fast (and will remain an optimization hotspot) it is not free, so it's not a good idea to send thousands (or even hundreds) of messages around per frame. Also, a message sender should never wait for the completion of a message to work around the synchronization problem (at least not while the game loop is running), as this would pretty much nullify the advantage of running the subsystem in its own thread.

Nebula3 offers a relatively simple way to add functionality which shall run in the context of a subsystem thread. The basic idea is to create a new message-handler class (which is running in the subsystem's thread) and a new set of messages which can be processed by an instance of the new handler-class.
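
As a rough sketch of that pattern (hypothetical, simplified interfaces; the real Nebula3 Messaging classes look different, so treat all names here as assumptions):

#include <memory>

// hypothetical message base class with a runtime message id
enum class MsgId { RenderDebugText, RenderDebugShapes };

struct Message {
    virtual ~Message() {}
    virtual MsgId GetId() const = 0;
};

struct RenderDebugText : public Message {
    MsgId GetId() const override { return MsgId::RenderDebugText; }
    // ...origin thread id and text elements would live here...
};

// hypothetical handler base class; the subsystem thread calls HandleMessage()
// for every message pulled out of its message queue
struct Handler {
    virtual ~Handler() {}
    virtual bool HandleMessage(const std::shared_ptr<Message>& msg) = 0;
};

// the new handler class which runs inside the subsystem (render) thread
struct DebugGraphicsHandler : public Handler {
    bool HandleMessage(const std::shared_ptr<Message>& msg) override {
        switch (msg->GetId()) {
            case MsgId::RenderDebugText:
                // forward to the thread-local back-end (e.g. a text renderer)
                return true;                    // message was handled
            default:
                return false;                   // not ours, try other handlers
        }
    }
};

Client code never calls the handler directly; it only creates messages and sends them through the subsystem's interface singleton, which queues them for the handler thread.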

We recently did this to add debug-visualization capability to Nebula3. We wanted a simple way to (a) render debug text and (b) render shapes (cubes, spheres, etc...) from anywhere in Nebula3.

The whole system is split into 3 parts:

  • The front-end classes running on the client side (client side means: every thread other than the render thread):
    • the Debug::DebugTextRenderer singleton offers text rendering
    • the Debug::DebugShapeRenderer singleton offers shape rendering
    • both are thread-local singletons, each thread which wants to render debug text or shapes needs to instantiate those
  • The back-end classes running in the render-thread:
    • CoreGraphics::TextRenderer
    • CoreGraphics::ShapeRenderer
    • these singletons implement the actual text- and shape-rendering functionality and are also platform-specific (under Windows, they use D3DX methods to do their jobs)
  • The communication components:
    • the Debug Render message protocol, this is a NIDL-XML-file (Nebula Interface Definition Language) which defines 2 messages: RenderDebugText and RenderDebugShapes
    • the DebugGraphicsHandler object, whose class is derived from Messaging::Handler, runs in the render thread, and processes the above 2 messages 

This is how the system works:

  1. the main thread instructs the GraphicsInterface singleton (which creates and manages the render thread) to add a DebugGraphicsHandler object (that's at least how it SHOULD work; at the moment the GraphicsHandler simply creates and attaches a DebugGraphicsHandler on its own)
  2. client threads create one DebugTextRenderer and one DebugShapeRenderer singleton if they want to do debug visualization
  3. a client thread directly calls one of the DebugTextRenderer or DebugShapeRenderer methods to render text or shapes
  4. the DebugTextRenderer and DebugShapeRenderer singletons collect a whole frame's worth of text elements and shapes and, once per frame, create a single RenderDebugText and RenderDebugShapes message. So at most 2 messages per frame are sent into the render thread from each client thread, not one message per shape or text element - a very important optimization! (see the sketch after this list)
  5. once per render frame, the DebugGraphicsHandler processes incoming RenderDebugText and RenderDebugShapes messages by calling the CoreGraphics::TextRenderer and CoreGraphics::ShapeRenderer singletons
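
A minimal sketch of the client-side batching from step 4 (hypothetical, simplified; the real DebugTextRenderer interface will differ, and the message-sending part is only indicated by a comment):

#include <string>
#include <vector>

struct TextElement {
    std::string text;
    float x, y;
};

class DebugTextRenderer {
public:
    // may be called from anywhere in the client thread, any number of times per frame
    void DrawText(const std::string& text, float x, float y) {
        this->textElements.push_back({ text, x, y });
    }

    // called exactly once per frame from the client thread's frame loop
    void OnFrame() {
        if (!this->textElements.empty()) {
            // here the real code would create ONE RenderDebugText message
            // containing the whole array and send it to the render thread
            this->textElements.clear();
        }
    }

private:
    std::vector<TextElement> textElements;   // one frame's worth of text
};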

That's it basically. Nebula3 applications can add their own functionality to subsystem threads by following the described pattern.

With the first naive implementation we stumbled across an obvious problem: when the main thread runs slower than the graphics thread, debug shapes and text start to flicker, since the render thread only receives render-debug messages every other frame. So we had to add a way to identify shapes and text elements by their origin thread id, and keep them around until the next message comes in from the same thread - a trivial thing to do.
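
A sketch of that fix (again with hypothetical names): the handler keeps the last batch per origin thread and simply replaces it when a new message from that thread arrives, so slower client threads don't cause flickering:

#include <map>
#include <vector>

typedef unsigned int ThreadId;
struct Shape { /* shape type, transform, color, ... */ };

class DebugShapeCache {
public:
    // called when a RenderDebugShapes message arrives: replace that thread's batch
    void OnRenderDebugShapes(ThreadId threadId, const std::vector<Shape>& shapes) {
        this->shapesByThread[threadId] = shapes;
    }

    // called once per render frame: draw the most recent batch of every thread
    void Render() {
        for (const auto& kv : this->shapesByThread) {
            for (const Shape& shape : kv.second) {
                // ...hand the shape to the platform-specific ShapeRenderer...
                (void)shape;
            }
        }
    }

private:
    std::map<ThreadId, std::vector<Shape>> shapesByThread;
};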

A positive effect is that debug visualization no longer needs to happen at a specific point in the render loop. This was a problem in Nebula2/Mangalore, where classes had to provide an "OnRenderDebug()" method which was called by the rendering system from within the render loop. Instead, debug visualization can now happen from anywhere in the code (although at the cost of some additional memory and communication overhead, but debug visualization in particular is an area where convenience and ease of use are more important than raw performance).

FYI, this is what the NIDL file looks like which defines the messages of the DebugRender protocol:

<?xml version="1.0" encoding="utf-8"?>
<Nebula3>
    <Protocol namespace="Debug" name="DebugRenderProtocol">
        <!-- dependencies -->
        <Dependency header="util/array.h"/>
        <Dependency header="threading/threadid.h"/>
        <Dependency header="coregraphics/textelement.h"/>
        <Dependency header="debugrender/debugshaperenderer.h"/>

        <!-- render text string on screen for debugging -->
        <Message name="RenderDebugText" fourcc="rdtx">            
            <InArg name="ThreadId" type="Threading::ThreadId"/>
            <InArg name="TextElements" type="Util::Array<CoreGraphics::TextElement>" />
        </Message>

        <!-- render debug shapes -->
        <Message name="RenderDebugShapes" fourcc="rdds">
            <InArg name="ThreadId" type="Threading::ThreadId"/>
            <InArg name="Shapes" type="Util::Array<CoreGraphics::Shape>" />
        </Message>

    </Protocol>
</Nebula3>    
    
This will be compiled by the Nebula3 NIDL-compiler-tool into one C++ header and one source file (debugrenderprotocol.h and debugrenderprotocol.cc). 

I hope to have a new source drop out "really-soon-now", so you can check for yourself what I'm actually talking about :)

8 Sept 2008

Mercenaries 2

I'm currently having a lot more fun with Mercs 2 than I ever had with GTA4. Proves that blowing up shit is a lot more important than story in sandbox games. At least to me, heh. The game is a bit rough around the edges and has a number of minor bugs and glitches, but all the important stuff (controls, gun feedback, immersion, frequency of oh-shit moments) is much better than in GTA (IMHO of course). Battling a group of 3 or 4 heavy tanks in an army-occupied city with only RPGs and C4 is an absolutely exhausting experience, but so much fun, with buildings crumbling left and right, tank shells and RPGs whizzing by and the distinctive sound of distant sniper fire over the general combat noise. Of course the player could opt to level the entire city with a few air strikes, but that would cost a lot of civilian lives and wouldn't be well received by the guerrillas. And besides, air strikes are freaking expensive ;)

12 Aug 2008

OpenGL 3.0 (more like 2.2)

Oh the drama! The long-awaited OpenGL 3.0 spec has been released today after years of fruitless discussions in the ARB and Khronos Group, and - surprise - it is NOT the fresh and clean rewrite that was promised, but instead only a minor release with a few extensions moved into the core feature set, and a few old features marked as deprecated. Nobody but the design committee knows exactly what justifies the version-number jump other than pretending progress.

In the old days of DX3 and DX5 I was a big fan of OpenGL, which was so much cleaner and easier to use compared to D3D (I still remember the horror of porting Urban Assault to DX5 in '97). I despised how Microsoft was completely rewriting the DirectX API every year. In hindsight that was their best decision: DX didn't have to care about old baggage and slowly got better until the excellent DX9, while the once-elegant OpenGL was buried under a heap of vendor-specific extensions over the years, resulting in the mess it is today.

Somewhere inside I'm a bit sad to see OpenGL slowly dwindle into oblivion, on the other hand I didn't really care about it for the last 4 or 5 years (I think I stopped caring when I attempted to write an OpenGL renderer for Nebula2 and realized that I had to use a dozen-or-so extensions just to get DX9's core features).

2 Aug 2008

MGS4

*** WARNING, SPOILERS AHEAD ***

Alright, first play-through done. The debriefing showed 20 hours of playtime. Overall an amazing experience, even if it felt like only one third of it was actual gameplay and the rest was watching cutscenes. The game starts really slowly, and in the first and second chapters I actually had to motivate myself to continue playing. Then, with the ending cutscene of chapter 3, the story suddenly becomes interesting and the rest of the game is an amazing ride; the only problem is that there are only 2 chapters left at this point.

The presentation of the game during cutscenes is jaw-dropping. The quality of the character models, facial animation and motion-capture performances is almost unbelievable. The direction and story-telling of the cutscenes provide better entertainment than most good action movies. Unfortunately, for every scene of pure awesomeness (basically every scene with Old Snake AND Liquid Ocelot) there are two scenes which are outright cheesy and a real pain to endure.

The story wraps up all the events of the previous MGS games and can be a bit hard to follow, since from time to time the story-telling switches into some sort of white-board mode and floods the player with details (which provide interesting background information but would IMHO be better told through codec conversations). The other problem is that the back-story of EVERY F*CKING PERSON who EVER showed up in ANY MGS game is brought to its end, turning parts of the story into a gigantic soap opera. I think 4 or 5 hours of the cutscenes could have been removed and I wouldn't have missed a thing.

At its core, MGS4 tells the story of 2 old men: Solid Snake (the hero from MGS1, and clone of Naked Snake from MGS3), who's suffering from accelerated aging programmed into his genes (as far as I understood that part of the story), and Liquid Ocelot, a chimera of Liquid Snake and Revolver Ocelot. Snake is portrayed as a cynical, pragmatic asshole, which really makes him a likeable protagonist. But the real hero of the game (to me) is Ocelot: he's an old man, just like Snake, but has aged naturally, and he (or rather his Liquid Snake part) still has all the energy and vision which Snake seems to have lost long ago. In the end it's all about handing the world over to the next generation (shame that this next generation mainly consists of whiners and wimps).

On to game-play: This is the disappointing part IMHO, which is a real shame since MGS4 is still considered a game, not a movie. The core sneaking elements are the same as in MGS2 and MGS3, which is of course a good thing. You can still hide in lockers or under cardboard boxes, shake knocked-out or dead enemies to loot them, and the camouflage element from MGS3 is also there, but now in its science-fiction version as the octocamo suit (basically, adaptive optical and infrared camouflage).

The survival elements from MGS3 have been removed and replaced with shooter gameplay, and that's where my biggest problem lies, because a sneaking element has been removed in favour of an action-oriented gameplay element. I'm a big shooter fan, so that wouldn't be much of a problem for me IF PROPERLY IMPLEMENTED. The problem is, the shooting stuff is only very basic, and for a shooter the controls feel extremely awkward. If gun handling were only half as good as in good tactical shooters like Rainbow Six Vegas I would be sold, but unfortunately it's far from it. Cover mechanics, especially firing from cover, have evolved dramatically since the times of MGS2 and MGS3, but in MGS4 firing from cover is still as awkward as in those old games. Also, the large number of guns offered to the player is kind of useless. There's no real reason to choose a submachine gun over an assault rifle. A sniper rifle is just as easy to use at close range as any other gun. An assault rifle with an optical sight is just as good as a sniper rifle over long range, because the levels are relatively small anyway.

I wish the developers had put all the work which went into the shooter aspects into new sneaking elements instead. Compared to the last Splinter Cell, the gameplay of MGS has fallen behind. Look at all the Bond gadgets Sam Fisher has at his disposal, or all the ways Fisher can approach a door to investigate what's behind it before sneaking into a room... (man, speaking of Splinter Cell - I think I need to replay Double Agent soon...).

But even if I don't particularly like the direction MGS has taken, it's still truly exceptional entertainment and one of the best games of the current console generation. And I think it was the right decision to put Snake to rest, and with him the MGS series, since I at least am ready for a breath of fresh air in the sneaking genre (although it seems there will be lots of trial and error ahead, as the S.C. Conviction and Assassin's Creed "debacles" show).

29 Jul 2008

PSN hmm...

I have removed the PSN "Portable ID" on the left side. Seems like it's just a static display for my user name and avatar picture doh... And I can't believe that Sony doesn't let me download (buy even!) additional gamer pictures from PSN (or I simply didn't find them? even after the re-design it isn't exactly easy to find stuff on the PSN shop, even though there's not THAT much stuff there). I'd really love to have an MGS4 gamer pic. It's all those little "un-important" things which make me appreciate the online-integration of the 360 (on the other hand, if I had gone straight from the PS2 to the PS3 I probably wouldn't even care, but the 360 really has spoiled me). Sony really needs to get all this simple and obvious stuff fixed first instead of wasting their time and resources on Home IMHO.

25 Jul 2008

Kept ya waiting huh?

Sooo I finally bought a PS3 yesterday. MGS4 did it (I basically bought a PS2 just for MGS3, so this was inevitable). My local GameStop was sold out completely (the guy there said he wouldn't get any new PS3s until the new 80GB model comes out at the end of August). MediaMarkt still had one MGS4 bundle left so I took that one.

First impressions:
  • OMG it's huge! Ok, the power supply is integrated, but the 360 looks tiny compared to this thing.
  • I was actually planning to wait for the white PS3 (black entertainment devices are plain ugly IMHO) but it looks like it will never arrive in Europe.
  • System and WLAN setup was fast and flawless.
  • Surprisingly, picture quality is worse on my TV than with my 360. I have a 2-year-old 32" Sony Bravia LCD, and have the PS3 connected through HDMI. The 360 is connected through VGA. There's a lot of edge aliasing going on in the PS3 picture. I suspect that's because the VGA connection runs at the native display resolution (1280x768 or so), while the 720p HDMI connection has a non-native display resolution, so the TV's scaler kicks in. It basically looks the same as when I had the 360 connected through component. Still, I would think that connecting one Sony device to another Sony device through a digital connection would produce a better picture.
  • During a game, the TV signal is lost from time to time (maybe once per hour for about 2 seconds). WTF? I've read about this in forums but thought this would only happen with some obscure TV's. Again this is a Sony TV and a Sony console. This definitely didn't happen when I had my 360 running through HDMI.
  • The XMB looks slick and feels much more responsive compared to the 360's dashboard.
  • Way too many system settings... half of which don't interest me at all (mouse sensitivity???)
  • The system didn't notify me that a software update is available. How is a typical user supposed to notice that he isn't up to date?
  • Not impressed by the web browser, feels terribly slow and unusable without a mouse...
  • Downloading the new firmware took FOREVER (20 minutes or so?), and displayed another progress bar for installation which took another couple of minutes. I don't remember waiting for more than 3 minutes for a system update on the 360...
  • MGS4 starts: ugh more installation... this isn't funny.
  • Start screen with all those little flowers looks REALLY messy on my TV...
  • Ok, WTF is this David Hayter interview shit before starting a new game?
  • Hmm, ingame graphics isn't quite as impressive as I remember from the trailers...
  • Snake's character model looks really great though.
  • Adaptive camo is nice.
  • 2 hours later...
  • MGS4 doesn't know whether it wants to be a shooter or a sneaker... definitely too much emphasis on guns and shooting, which would be fine with me if the shooter controls weren't horribly broken; getting out of that abandoned hotel with the hot chick's team was an extremely frustrating experience
  • story didn't exactly grab me so far... cutscenes are cringe-worthy when they try to be funny
Guess MGS4 needs to grow on me. When I launched MGS3 I was immediately fascinated and couldn't stop playing. This hasn't happened so far with MGS4. Will post more impressions when I've finished the game.

22 Jul 2008

Ninja Gaiden DS

Ninja Gaiden Dragon Sword for the Nintendo DS is finally available in Germany (Dragon Sword - DS - gettit?). How the game is controlled using the stylus is astonishing, breathtaking, eye-opening, dare I say: revolutionary.

They nailed the controls, plain and simple. Just like the 2 Xbox games, the DS version is all about rhythm, but instead of hitting button combos, playing NG DS is more like painting a picture with a brush. A few well-placed "brush strokes" here and there, a few taps over there, and another group of Spider Clan ninjas is history. Something complex like an Izuna Drop is simply done by a down-up-up stroke over the enemy (the first down-stroke is a more-or-less normal sword attack, the first up-stroke launches the enemy into the air, and the second up-stroke has Ryu jump into the air, grab the enemy and whirl them into the ground).

The graphics are beautiful for what the DS can do. The backgrounds are 2D bitmaps, but give a good 3D-ish illusion since the 3D characters can move into the screen with the correct perspective projection.

Best DS game in a while, shame that it got so little attention from the DS crowd.

20 Jul 2008

10 Years After

Andreas, an old friend of mine, noticed that Urban Assault went Gold (or RTM - Release To Manufacture, as it was called at Microsoft at that time) on July 13th 1998, almost exactly 10 years before Drakensang went Gold (which was on the 10th of July 2008). He even sent me the original announcement mail I sent back to Germany. Andreas and I had been "stationed" over in Seattle at that time to apply the final fixes and polish to UA, since flying the programmers over to North America was probably more convenient in 1998 than FTPing complete daily game builds over ISDN.

Apart from the usual post-project cleanup stuff I can now start to work on new and exciting technology again: improving our tool-chain, continuing to work on Nebula3, and dusting off those Wii and 360 devkits which probably feel a bit neglected due to our focus on finishing Drakensang.

We're also a licensed PS3 developer now. An N3 port hasn't started yet, but from looking through the SDK docs the programming environment doesn't seem to be too bad. The underlying philosophy or "style" is a bit different from the 360 SDK's (just like, for instance, Win32 has a different style than Unix). But all in all the PS3 SDK looks complete and actually quite usable. It's also quite obvious from looking through the release-note history that the PS3 SDK has improved a lot since the PS3 launch. I think it will be relatively easy to get "something" up and running on the PS3; squeezing the last bit of performance out of the bitch, however, will be something completely different I'm sure ;)

The Dead Rising port to the Wii might be the most interesting news I took out of E3. It made me play the 360 version again, and this is one of those games which get better and better over time.

I totally underestimated the importance of the books one can find in the different book stores. In the past I ignored them because they take up valuable inventory slots. But what they actually do is they make every single slot much more valuable. For instance there is a book which triples the time edged weapons can be used until they break, and another which triples the usage time of items from home-improvement-stores. And the effects actually stack. For instance the mini-chainsaws which are unlocked after killing the clown-psycho (pretty much the most powerful melee weapons in the game) fall under both categories!

The Wii port of Dead Rising makes immediate sense with all the special attacks or shaking off zombies for instance. These should translate very well to waggle. I'm concerned though about the number of zombies. It's just not Dead Rising without hundreds of zombies on screen. The first batch of Wii screens have a suspicious lack of zombies in them...

27 Jun 2008

Status Update

We're currently in release candidate mode for Drakensang which basically means we're more or less on standby until QA hits the alarm button because they found a show-stopper bug. The good thing is that I can spend most of my time at work actually playing the game from start to finish. Obviously I'm biased, but a truly wonderful game it has become. Of the games I've been involved with in my life, Drakensang is probably the one I am most proud of.

After Drakensang has gone gold I think I can spend more time with Nebula3 and the blog again.

In stark contrast to Drakensang I've spent the last couple of evenings at home mainly with Ninja Gaiden 2. I did a normal play-through on Path Of The Warrior, one Dragon Sword weapon run on Acolyte (which really was too easy), and am currently half-way through a Lunar Staff weapon run on Warrior with many many play-throughs to follow.

Coming straight from Ninja Gaiden Black I had some trouble adapting to the new style of the game. In the original, fights are against 3 or 4 enemies at once. In NG2 a typical fight is against 10..15 enemies at once. The combo and hit-recovery timing is a bit different in NG2, which made the controls feel sluggish to me at the beginning. Also, the level of polish couldn't be more different. NGB was probably one of the most polished games in history, while the lack of polish in NG2 is unfortunately quite apparent (yes, the camera does indeed have some issues, I had 2 freezes so far, and the game goes into some sort of slow-motion mode in heavy fighting situations).

But despite these flaws NG2 still trumps any other fighting game I have played so far, because the rest is so f*cking great. The core game mechanics are so extremely carefully tuned (and fine-tuned compared to NGB) that the laughable story and less-than-stellar boss fights are simply not that important. NG2 shines where the player spends most of the time: fighting hordes of ninjas and monsters. The actual combat is so intense and incredibly satisfying that I want to start a new game immediately after finishing the last one. There is one special moment in the game involving a staircase and maybe 100 or 200 ninjas which is simply jaw-dropping (and is probably my favourite gaming moment of all time).

If there ever was a flawed diamond among games, it is Ninja Gaiden 2. Even with its flaws it is a really exceptional game, but if Itagaki and his team of ronin find a way to improve the game as they did with Ninja Gaiden Black the result would be ultimate perfection.

11 Jun 2008

NG2 (YES!!!)

I finally got the game! I ordered the UK version because NG2 isn't available in Germany anyway. I just rushed quickly through the first 2 chapters yesterday evening and holy shit this game is epic. DMC4 was fun, but NG2 is simply on a whole different level in every aspect (yeah, I'm a biased Ninja Gaiden fanboi but anyway...).

One thing I noticed immediately is that the rhythm of the game is a bit different from Ninja Gaiden Black. It was a bit harder for me to pull off an Izuna Drop in the beginning. Ryu's control feels a bit heavier (or one could say, a little bit more realistic). And Jesus, are the enemies aggressive right from the beginning. Hold a block for more than 2 seconds and the bastards will rip you apart :o) But due to the new partial life regeneration and plenty of save points, the game feels a bit easier on Path Of The Warrior compared to NGB on Normal difficulty (so far at least).

The gore is a bit too over the top for my taste. Nothing against some blood-spilling, but I felt a bit uneasy after I had reduced a group of 10..20 Spider Clan ninjas to a bloody heap of arms, legs, torsos and some gooey things which I don't want to inspect too closely. At least the first time; afterwards it becomes kind of a routine. This is basically the game version of Kill Bill (wow, how cool would that be: playing as a yellow-clad Uma Thurman through NG2...).

I love this game.

27 May 2008

What Camera Issues?

OMG I can't believe reviewers still bitch about the camera in Ninja Gaiden 2. If it's the same as in NG1 (which they complained about as well when it came out) then it's just perfect. Just tap the right trigger and the camera snaps back behind Ryu. That's the secret to successful camera control in Ninja Gaiden ;)

13 May 2008

Slides to my Quo Vadis presentation

Here are the slides to the presentation I gave at the Quo Vadis developer conference last week in Berlin. It's all in German, but it has some pretty pictures in it ;) Unfortunately the Save-As-PDF feature in Powerpoint doesn't embed the video files (only pictures of them), so I packed everything into a zip archive.

Here's the link.

12 May 2008

More Memory Debugging Notes

A few memory debugging tips which I found useful during the past few weeks:
  • Always look at ALL process heaps when doing memory debugging. It turns out we still had a rather obvious memleak in Drakensang. It didn't show up in our own memleak dumps because it happened in an external heap created by SpeedTree, and we can only track allocations going through our own memory subsystem and through the CRT. After dumping a summary of all heaps returned by GetProcessHeaps() before and after loading a level, one of the "external" heaps showed a memleak of up to 10 MB per level-load! The heap disappeared after compiling without SpeedTree support, so the culprit was easy to identify (of course it wasn't SpeedTree's fault, but a bug in our own tree instancing code, which was fixed in half an hour; admittedly it was a very obscure special case not to have shown up as a memleak in our own dumps as well). A minimal sketch of such a heap dump follows after this list.
  • Use memory allocation hooks in 3rd party libs if they support it. In fact I wish all libs would support a mechanism to re-route memory allocations to my own routines, if only for debugging reasons. There's no way to track memory allocations happening inside XACT for instance (that I'm aware of?!).
  • Write an automatic stress test for your app early in the project. A while ago we started running continuous playthrough sessions with a hot-seat system where our testers hammer the same game session 24/7. The more subtle memory leak bugs often only trigger after 10 or 20 hours of continuous playtime. Despite this continuous testing I also wrote a little stress test mode, where the game loads a level, lets the level run for 30 seconds and then loads another level, ad infinitum overnight. This may amplify bugs which happen during load time, and may attenuate bugs happening during normal gameplay (the SpeedTree-related memleak mentioned above went critical much earlier in the stress test than in normal gameplay sessions). A watertight generic record/replay mechanism in the engine would be helpful as well (we don't have that in Drakensang, but this may be a feature we might look into in the future).
  • A fixed memory layout is actually helpful on the PC as well. Drakensang doesn't have this, but I'm becoming more and more a fan of a fixed memory layout. Set aside N megabytes for C++ objects, another chunk of memory as a temporary load/save buffer, and fixed memory buffers for the different resource types; allocate those blocks at the beginning of the game either as non-growable heaps or as pre-allocated virtual memory blocks with custom-tailored memory management - and let the game crash if any of the heaps is exhausted. The main reason why this is a good idea is not so much finding memory leaks, but preventing resource usage from growing out of control during development.
I have collected a lot of ideas for the Nebula3 memory subsystem which I will play around with as time permits.
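
As a rough illustration of the per-heap dump mentioned above, here's a minimal Win32 sketch (not the actual Drakensang code) which sums up the allocated bytes of every process heap. Calling it before and after a level load and diffing the numbers is enough to spot leaks in "external" heaps that bypass the engine's own memory tracking:

#include <windows.h>
#include <stdio.h>

void DumpProcessHeaps()
{
    HANDLE heaps[256];
    DWORD numHeaps = GetProcessHeaps(256, heaps);
    if (numHeaps > 256) numHeaps = 256;     // more heaps than our buffer can hold
    for (DWORD i = 0; i < numHeaps; i++)
    {
        SIZE_T allocated = 0;
        PROCESS_HEAP_ENTRY entry = { 0 };   // lpData must be NULL before the first HeapWalk()
        if (HeapLock(heaps[i]))
        {
            while (HeapWalk(heaps[i], &entry))
            {
                if (entry.wFlags & PROCESS_HEAP_ENTRY_BUSY)
                {
                    allocated += entry.cbData;
                }
            }
            HeapUnlock(heaps[i]);
        }
        printf("heap %u: %u KB allocated\n", (unsigned int)i, (unsigned int)(allocated / 1024));
    }
}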

4 May 2008

GTAIV

Dammit, I wish I had more time to play GTA... In the first hour or two the game actually wasn't the jaw-dropper I had expected (other games on the 360 definitely look better, and the walking controls are just as shitty as in the old GTAs). But boy, has the game grown on me after a little while... there's not one session where I don't discover some hilarious shit to do in the game... there's so much attention to detail in the tiniest, most forgotten and dark corners of Liberty City... And some of the missions I have encountered so far are simply epic (the gunwork has improved dramatically over San Andreas, although the new cover system could have used a bit more work). Some of the elements I liked in San Andreas have unfortunately been removed (like the character development stuff), but all the new features and improvements definitely compensate for the loss. R* have delivered again.

28 Apr 2008

COD4 DLC woes...

Here's an interesting side-effect of the COD4 map pack: ever since I downloaded the map pack, my K/D ratio started to suffer terribly. Once I deleted the map pack from my HD the ratio immediately improved. Looks like the map pack was mainly purchased by hardcore players, so there aren't enough noob players like me available for fair match-making. So the DLC actually made the game less enjoyable for me. Well, I didn't like Chinatown anyway ;)

20 Apr 2008

Memory Issues

The two most critical issues with dynamic memory allocation seem to be memory leaks and memory fragmentation. While memory leaks can be discovered and fixed easily, memory fragmentation is harder to track down, since it usually only shows up when the application runs for many hours. At some point, allocation of large memory blocks may fail even though the total amount of free memory is more than enough to satisfy the request. I have recently added some counter-measures to Drakensang, which I have ported over to Nebula3 during the last 2 hours. The basic idea is to move away from the general Memory::Alloc()/Memory::Free() and to group allocations by usage pattern into different heaps, so that small, large, short-lived or long-lived memory blocks are not all allocated from the same heap. Nebula3 now defines various global HeapTypes, which need to be provided as arguments to the Memory::Alloc() and Memory::Free() functions. Platform ports of Nebula3 are free to define additional platform-specific heap types, as long as only platform-specific code uses those new types. The platform-specific code also has full control over the initialization of the global heaps (like the initial heap size, or whether the heap is allowed to grow or not), which is especially important for console platforms with their restricted amount of memory and no page-file swapping. For now I have arbitrarily defined the following heap types:
  • DefaultHeap: for stuff that doesn't fit anywhere else
  • ObjectHeap: for RefCounted objects (this may be split into more HeapTypes in the future)
  • SmallBlockHeap: general heap for small allocations
  • LargeBlockHeap: general heap for "large" allocations (several megabytes)
  • ResourceHeap: for long-lived resource data, like animation keys
  • ScratchHeap: for short-lived memory blocks
  • StringHeap: for string-data
  • StreamDataHeap: used by classes like MemoryStream or ZipFileStream
Those heap types may change in the future; I'm not sure yet whether these are too many or too few types (see the small usage sketch below). I'll add some more status information to the memory HTTP debug page handler with statistics about the different heap types, so it should be easy to see whether this configuration is good or not. Also, the Nebula3 code doesn't have that many calls to Memory::Alloc() or Memory::Free() (around 20..30 or so), so it's relatively easy to try out different usage patterns. Of course it's still possible to create a special local heap using the Memory::Heap class. On a side note: as you may have noticed, my posting frequency has suffered a lot recently. The reason is that I'm now 110% focused on Drakensang, there's just not enough time left to do a lot of work on N3 or on the blog. Don't expect this to change until around mid-July :)
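
For illustration, here's how grouping allocations by heap type could look from the caller's side. This is just a hedged sketch assuming the heap type is passed as the first argument to Memory::Alloc()/Memory::Free(); the exact signatures may differ:

// long-lived animation key data goes onto the resource heap
void* animKeys = Memory::Alloc(Memory::ResourceHeap, numKeys * 4 * sizeof(float));
// ... fill and use the animation keys for the lifetime of the resource ...
Memory::Free(Memory::ResourceHeap, animKeys);

// short-lived working memory goes onto the scratch heap instead
void* tmpBuffer = Memory::Alloc(Memory::ScratchHeap, 64 * 1024);
// ... do some temporary work ...
Memory::Free(Memory::ScratchHeap, tmpBuffer);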

31 Mar 2008

In Oblivion

I can't believe I started playing Oblivion again. I finally want to finish the main quest; last time I played through nearly all guild quests and then didn't have enough motivation left to go on with the main story... I was immediately sucked into the game again. Dungeon crawling is where Oblivion really shines, even more than Morrowind (which I still consider the better overall game). I have created a new dark-elf nightblade which I'm playing as a stealthy, magic-wielding, arrow-shooting bad-ass assassin :o)

I also started to play around a bit with SVG, since I was looking for a cheap way to render diagrams for the Nebula3 debugging and profiling subsystem. I'll go into more details in a later post, but let me just say that SVG kicks ass and is exactly what I was looking for. Fun-fact: the only browser that can't render SVG out of the box is IE7 (Firefox, Opera and Safari are fine).

19 Mar 2008

Complexity

Just came across this citation on Slashdot:

"Any third-rate engineer or researcher can increase complexity; but it takes a certain flair of real insight to make things simple again." - E.F.Schumacher.

Every programmer and game designer should bow before these mighty words of wisdom. Guess I need to read his book "Small Is Beautiful" now.

16 Mar 2008

Gaming Weekend

I'm not feeling very productive this weekend, which might have to do with the shitty weather in Berlin... just the right weather to stay at home and play some games. I bought Bully for my 360 last week. I didn't play the original on the PS2, and although I read that the game "might freeze on some older consoles" I gave it a try. And guess what, it froze on me about 2 hours into the game, losing at least an hour of progress since the last save. I'm now waiting for the patch Rockstar promised for last week, since the game really looks like fun. Lousy certification job though, this bug shouldn't have slipped through.

Played through the first chapter of Rainbow Six Vegas again... and I must say the game hasn't aged very well. The graphics are a bit too dirty; it's very hard to make out enemies against the background, at least in the Mexican setting at the beginning. There's still no better cover system in any other game though, but I had a hard time adapting to the controls again (blew myself up several times because 'B' is 'throw grenade' instead of crouch). I think it was the right decision to make the graphics in Vegas 2 that much cleaner, even though I was turned off at first by the "cartoony" look of the screenshots.

After that mildly frustrating experience I played some more Ninja Gaiden Black on Hard difficulty. I finally want to kick Alma's ass. This game just gets better the more you play it. The structure of the game is very different on Hard difficulty. There are new enemies, items are distributed differently in the world, you get weapons and their upgrades much later, and the boss fights are much more challenging because the bosses are now accompanied by minions. It is amazing how well balanced the rock-paper-scissors system in Ninja Gaiden is. A different weapon can make a subtle but very important difference against a specific enemy type. For instance, at first glance the new cat-demons in Hard difficulty just look like a more annoying version of the Black Spider Ninjas, but while the ninjas can be controlled very nicely with the Lunar Staff, I feel much more comfortable fighting the cat-demons with the nunchaku (I still need to do some experimentation with the Vigoorian Flail though). Hard difficulty also forces you to learn to block, jump and roll much more efficiently to avoid attacks. I recently downloaded Ninja Gaiden as an Xbox Originals title even though I also own the disc version. Not having to swap discs for a quick round of Ninja Gaiden fun is well worth the 1200 points IMHO :)

I also played an hour of Crackdown. This game is still so much fun... I was playing around with some of the more advanced stuff I didn't use during my earlier play-throughs, for instance specifically aiming for body- or car-parts (head-shots with the sniper rifle over insane distances, or causing havoc on the highways by blowing up the gas tank or tires of passing vehicles). I read somewhere that GTA4 will use a similar targeting system; if true this would be great, because I've really started to appreciate the added targeting functionality in Crackdown, especially when playing a bit more tactically instead of blowing up the whole perimeter Terminator-style.

I finally ended the day with a few rounds of COD4 multi-player. I'm now on my second prestige-round. I guess I have finally finished my transition from a keyboard/mouse- to a gamepad-FPS player. I can pull off shit with the gamepad now which I deemed impossible one year ago :)

15 Mar 2008

Vertex Component Packing

I finally got around to optimizing the vertex component sizes for Drakensang. A typical vertex (coords, normal, tangent, binormal, one uv-set) is now 28 bytes instead of 56 bytes, a light-mapped mesh vertex (2 uv-sets) is now 32 bytes instead of 64, and a skinned vertex has been reduced to 36 bytes instead of 88. With this step I have finally burned all DX7 bridges, all our projects now have a shader model 2.0 minspec (since Radon Labs also does casual titles, we had to support Win98 and DX7 for much too long). As a result, the size of all mesh resources in Drakensang has been reduced from a whopping 1.2 GByte down to about 650 MByte. This also means reduced loading times and better vertex throughput when transferring vertex data to the graphics chip. Some vertex components need to be scaled to the proper range in the vertex shader, but this is at most one multiply-add operation per component.

I also implemented support for the new vertex formats in Nebula3. N3 always had support for packed vertex components, so all I had to do was to add a few lines to the legacy NVX2 mesh loader and fix a few places in the vertex shaders for unpacking normals and texcoords.

Here's how the vertex components are now packed by default:
  • Position: Float3 (just as before)
  • Normal, Tangent, Binormal: UByte4N (unsigned byte, normalized)
  • TexCoord: Short2 as 4.12 fixed point
  • Color: UByte4N
  • Skin Weights: UByte4N
  • Skin Joint Indices: UByte4
Normals, tangents, binormals and tex-coords need an extra unpacking instruction in the vertex shader. Skin weights need to be "re-normalized" in the vertex shader because they lose too much precision:

float4 weights = packedWeights / dot(packedWeights, float4(1.0, 1.0, 1.0, 1.0));

This makes sure that the components add up to 1.0. In case you're wondering, the dot product is equivalent to s = (x + y + z + w); it's just much more efficient because the dot product is a native vertex shader instruction (although I must confess that I haven't checked yet whether fxc's optimizer is clever enough to turn the horizontal sum into a dot product automatically).
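
For completeness, here's a hedged sketch of how the tool-side packing could look for the formats listed above (the helper functions are my own illustration, not the actual Drakensang exporter code):

#include <stdint.h>

// map a [-1..1] float (normal/tangent/binormal component) to an unsigned
// normalized byte, to be unpacked in the vertex shader with x * 2 - 1
static uint8_t PackUnitFloatToUByteN(float f)
{
    return (uint8_t)((f * 0.5f + 0.5f) * 255.0f + 0.5f);
}

// map a texture coordinate to 4.12 fixed point (4 integer bits, 12 fraction
// bits), to be unpacked in the vertex shader with a multiply by 1.0 / 4096.0
static int16_t PackTexCoordTo4Dot12(float f)
{
    return (int16_t)(f * 4096.0f);
}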

5 Mar 2008

Nebula3's Multithreaded Rendering Architecture

Alright! The Application Layer is now running through the new multithreaded rendering pipeline.

Here's how it works:

  • The former Graphics subsystem has been renamed to InternalGraphics and is now running in its own "fat thread", together with all the lower-level Nebula3 subsystems required for rendering.
  • There's a new Graphics subsystem running in the application thread with a set of proxy classes which mimic the InternalGraphics subsystem classes.
  • The main thread is now missing any rendering related subsystems, so trying to call e.g. RenderDevice::Instance() will result in a runtime error.
  • Extra care has been taken to make the overall design as simple and "fool-proof" as possible.
  • There's very little communication necessary between the main and render threads. Usually one SetTransform message for each graphics entity which has changed its position.
  • Communication is done with standard Nebula3 messages through a single message queue in the new GraphicsInterface singleton. This is an "interface singleton" which is visible from all threads. The render thread receives messages from the main thread (or other threads) and never actively sends messages to other threads (with one notable exception on the Windows platform: mouse and keyboard input).
  • Client-side code doesn't have to deal with creating and sending messages, because it talks to the render thread through proxy objects. Proxy objects provide a typical C++ interface, and since there's a 1:1 relationship they may cache data on the client side to prevent a round-trip into the render thread (so there's some data duplication, but a lot less locking).
  • The Graphics subsystem offers the following public proxy classes at the moment:
    • Graphics::Display: setup and query display properties
    • Graphics::GraphicsServer: creates and manages Stages and Views
    • Graphics::Stage: a container for graphics entities
    • Graphics::View: renders a "view" into a Stage into a RenderTarget
    • Graphics::CameraEntity: defines a view volume
    • Graphics::ModelEntity: a typical graphics object
    • Graphics::GlobalLightEntity: a global, directional light source
    • Graphics::SpotLightEntity: a local spot light
  • These proxy classes are just pretty interfaces and don't do much more than create and send messages into the GraphicsInterface singleton.
  • There are typically 3 types of messages sent into the render thread:
    1. Synchronous messages which block the caller thread until they are processed, this is just for convenience and only exists for methods which are usually not called while the main game loop is running (like Display::GetAvailableDisplayModes())
    2. Asynchronous messages which return immediately but pass a return value back at some later time. These are non-blocking, but the result will only be available in the next graphics frame. The proxy classes do everything possible to hide this fact, either by caching values on the client side so that no communication is necessary at all, or by returning the previous value until the graphics thread gets around to processing the message.
    3. The best and simplest messages are those which don't require a return value. They are just sent off by the client-side proxy and processed at some later time by the render thread. Fortunately, most messages sent during a frame are of this nature (e.g. updating entity transforms).
  • Creation of graphics entities is an asynchronous operation; it is possible to manipulate the client-side proxy object immediately after creation even though the server-side entity doesn't exist yet. The proxy classes take care of all these details internally.
  • There is a single synchronization event per game-frame where the game thread waits for the graphics thread. This event is signalled by the graphics thread after it has processed the pending messages for the current frame and before culling and rendering. This is necessary to prevent the game thread from running faster than the render thread and thus spamming its message queue. The game thread may run at a lower - but never at a higher - frame rate than the render thread.

Here's some example code from the testviewer application. It actually looks simpler than before, since all the setup code has become much tighter:
using namespace Graphics;
using namespace Resources;
using namespace Util;

// setup the render thread
Ptr<GraphicsInterface> graphicsInterface = GraphicsInterface::Create();
graphicsInterface->Open();

// setup and open the display
Ptr<Display> display = Display::Create();
// ... optionally change display settings here...
display->Open();

That's all that is necessary to open a default display and get the render thread up and running. The render thread will now happily run its own render loop.

To actually have something rendered we need at least a Stage, a View, a camera, at least one light and a model:

// create a GraphicsServer, a Stage and a default View
Ptr<GraphicsServer> graphicsServer = GraphicsServer::Create();
graphicsServer->Open();

Attr::AttributeContainer dummyStageBuilderAttrs;
Ptr<Stage> stage = graphicsServer->CreateStage(StringAtom("DefaultStage"),
                                               Graphics::SimpleStageBuilder::RTTI,
                                               dummyStageBuilderAttrs);

Ptr<View> view = graphicsServer->CreateView(InternalGraphics::InternalView::RTTI,
                                            StringAtom("DefaultView"),
                                            StringAtom("DefaultStage"),
                                            ResourceId("DX9Default"),
                                            true);

// create a camera and make it the active camera for our view
Ptr<CameraEntity> camera = CameraEntity::Create();
camera->SetTransform(matrix44::translation(0.0f, 0.0f, 10.0f));
stage->AttachEntity(camera.cast<GraphicsEntity>());
view->SetCameraEntity(camera);

// create a global light source
Ptr<GlobalLightEntity> light = GlobalLightEntity::Create();
light->SetTransform(matrix44::rotationx(n_deg2rad(-70.0f)));
stage->AttachEntity(light.cast<GraphicsEntity>());

// finally create a visible model
Ptr<ModelEntity> model = ModelEntity::Create();
model->SetResourceId(ResourceId("mdl:examples/eagle.n2"));
stage->AttachEntity(model.cast<GraphicsEntity>());

That's the code to set up a simple graphics world in the asynchronous rendering case. There are a few issues I still want to fix (like the InternalGraphics::InternalView::RTTI thing). The only thing that's left is to add a call to GraphicsInterface::WaitForFrameEvent() somewhere in the game loop before updating the game objects for the next frame. The classes App::RenderApplication and App::ViewerApplication in the Render layer will actually take care of most of this stuff.
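
Just to illustrate where that call would go, here's a hedged sketch of a game loop (everything except the WaitForFrameEvent() name is placeholder code, and whether it is called through the singleton instance or statically is an assumption on my part):

while (!quitRequested)
{
    // block until the render thread has consumed last frame's messages,
    // so the game thread can never run ahead of the render thread
    GraphicsInterface::Instance()->WaitForFrameEvent();

    // now it's safe to update the game objects and send new
    // SetTransform messages through the proxy objects
    UpdateGameWorld();
}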

There's some brain adaptation required to work in an asynchronous rendering environment:

  • there's always a delay of up to one graphics frame until a manipulation actually shows up on screen
  • it's hard (and inefficient) to get data back from the render thread
  • it's impossible for client-threads to read, modify and write-back data within one render-frame

For the tricky application-specific stuff I'm planning to implement some sort of installable client handlers. Client threads can install their own custom handler objects which would run completely in the render-thread context. This is IMHO the only sensible way to implement application-specific graphics functionality which requires exact synchronization with the render loop.
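
A purely speculative sketch of what such an installable handler could look like (none of these names exist in Nebula3 yet):

class RenderThreadHandler : public Core::RefCounted
{
public:
    // called from inside the render thread once per frame, after pending
    // messages have been processed and before culling and rendering
    virtual void OnRenderFrame() = 0;
};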

I had to make a few other changes to the existing code base for the asynchronous rendering to work: mouse and keyboard events under Windows are produced by the application window (which is owned by the render thread), but the input subsystem lives in the game thread. Thus there needs to be a way for the render thread to communicate those input events to the main thread. I decided to derive a ThreadSafeDisplayEventHandler class (and a ThreadSafeRenderEventHandler for the sake of completeness). Client threads can install those event handlers to be notified about display and render events coming out of the render thread.

The second, bigger change affected the Http subsystem. Previously, HttpRequestHandlers had to live in the same thread as the HttpServer, which isn't very useful anymore now that important functionality has been moved out of the main thread. So I basically moved the whole Http subsystem into its own thread as well, and HttpRequestHandlers may now be attached from any thread. A nice side effect is that an Http request now only stalls the thread of the HttpRequestHandler which processes it.

There's still more work to do:

  • need to write some stress-tests to uncover any thread-synchronization bugs
  • need to do performance investigations and profiling (are there any unintended synchronization issues?)
  • thread-specific low-level optimization in the Memory subsystem as detailed in one of my previous posts
  • optimize the messaging system as much as possible (especially creation and dispatching)
  • I also want to implement some sort of method to run the rendering in the main thread, partly for debugging, partly for platforms with simple single-core CPUs

Phew, that's all for today :)

28 Feb 2008

COD4 FTW!

Hmm, just noticed that COD4 is now my personal most-played game, replacing Oblivion which ruled at the top spot since I bought my 360 in 2006. The funny thing is that I remember Oblivion as that huge time-eater which I played the whole summer of 2006, while COD4 feels like I just barely started to really play it. Scary.

I also finally beat the final boss in Prince Of Persia (the XBLA remake)! The combat system is surprisingly complex; attacking and blocking require very good timing, a bit like the sword fights in Assassin's Creed. Oh, and I started Lost Odyssey. Don't know what to think of it yet. It didn't exactly grab me; I'm about 3 hours in, and there's just nothing happening in this game! Everything is so stretched out, and I keep thinking "man, I could play some COD4 instead of struggling through this borefest"... and that's what I usually do about 10 minutes later...

I'm also currently on my third DMC4 play-through in Son-Of-Sparda mode. Great great game. Nice distraction until the king returns in June ;)

27 Feb 2008

Lowlevel Optimizations

I'm currently doing memory optimizations in Drakensang, and together with the new ideas from the asynchronous rendering code in N3 I'm going to do a few low-level optimizations in the Nebula3 memory subsystem over the next few weeks. Here's what I'm planning to do:
  • Add a thread-safe flag to the Heap class. Currently a heap is always thread-safe, but there will now be quite a few cases where it makes sense not to pay the additional thread-safety overhead in the allocation routines.
  • Add some useful higher-level allocators:
    • FixedSizeAllocator: This optimizes the allocation of many small same-size objects from the heap. It will pre-allocate big pages and manage the memory within the pages itself. The main advantage comes from the fact that all blocks in a page are guaranteed to be the same size (a minimal sketch follows after this list).
    • BucketAllocator: This is a general allocator which holds a number of buckets for e.g. 16, 32, 48, ... 256 byte blocks (the buckets are just normal FixedSizeAllocators). Small allocations can be satisfied from the buckets, larger allocations go directly through the heap as usual.
  • Override the new operator in all RefCounted-derived classes to use a BucketAllocator (I'll do some profiling first to check whether this is actually faster than Windows' Low Fragmentation Heap). This will happen behind the scenes in the DeclareClass()/ImplementClass() macros, so no changes to the class source code are necessary.
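Here's a minimal sketch of the FixedSizeAllocator idea, assuming a single pre-allocated page with the free blocks chained into a free list (this is my own illustration, not actual Nebula3 code):

#include <stdlib.h>
#include <assert.h>

class FixedSizeAllocator
{
public:
    FixedSizeAllocator(size_t blockSize, size_t numBlocks)
    {
        assert(blockSize >= sizeof(void*));
        this->page = (char*) malloc(blockSize * numBlocks);
        this->freeList = 0;
        // chain all blocks of the page into the free list
        for (size_t i = 0; i < numBlocks; i++)
        {
            char* block = this->page + i * blockSize;
            *(void**)block = this->freeList;
            this->freeList = block;
        }
    }
    ~FixedSizeAllocator()
    {
        free(this->page);
    }
    void* Alloc()
    {
        assert(0 != this->freeList);        // page exhausted
        void* block = this->freeList;
        this->freeList = *(void**)block;    // pop the block from the free list
        return block;
    }
    void Free(void* block)
    {
        *(void**)block = this->freeList;    // push the block back onto the free list
        this->freeList = block;
    }
private:
    char* page;
    void* freeList;
};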
The biggest new feature (which depends on all of the above) is that I want to split RefCounted classes into thread-safe and thread-local classes. The idea is that a thread-local class promises that creation, manipulation and destruction of its objects happens from the same thread. A thread-local class can do a few important things with less overhead:
  • thread-local classes would create their instances from a thread-local BucketAllocator which doesn't have to be thread-safe
  • thread-local classes could use normal increment and decrement operations for their refcounting instead of Interlocked::Increment() and Interlocked::Decrement(). Since every Ptr<> assignment changes the refcount of an object this may add up quite a bit.
By far most classes in Nebula3 can be thread-local; only message objects which are used for thread communication, and some low-level classes, need to be thread-safe. I'm planning to enforce thread-locality in Debug mode by storing a pointer to the local thread name in each instance, and checking in AddRef, Release and other RefCounted methods whether the current thread context is the same (that's a simple pointer comparison, and it happens only in Debug mode; see the sketch below).
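A minimal sketch of that debug check, assuming some GetCurrentThreadName() helper returns a per-thread string pointer (the names and the class layout are illustrative only, not the actual Nebula3 implementation):

#include <assert.h>

extern const char* GetCurrentThreadName();   // assumed per-thread name accessor

class ThreadLocalRefCounted
{
public:
    ThreadLocalRefCounted() : refCount(0)
    {
        #ifdef _DEBUG
        // remember which thread created this object
        this->creatorThreadName = GetCurrentThreadName();
        #endif
    }
    void AddRef()
    {
        #ifdef _DEBUG
        // a simple pointer comparison catches cross-thread access
        assert(this->creatorThreadName == GetCurrentThreadName());
        #endif
        ++this->refCount;       // plain increment, no Interlocked::Increment() needed
    }
    void Release()
    {
        #ifdef _DEBUG
        assert(this->creatorThreadName == GetCurrentThreadName());
        #endif
        if (0 == --this->refCount)
        {
            delete this;
        }
    }
protected:
    virtual ~ThreadLocalRefCounted() { }
private:
    int refCount;
    #ifdef _DEBUG
    const char* creatorThreadName;
    #endif
};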

The general strategy is to get as far away as possible from general Alloc()/Free() calls, and to make memory management more "context sensitive" and also more static (in the sense that the memory layout doesn't change wildly over the runtime of the game). It's extremely important for a game application to set aside fixed amounts of memory right from the beginning of the project (and to let the game fail hard if the limits are violated), otherwise everything grows without control, and towards the end of the project much time must be invested to cut everything back to reasonable limits.

19 Feb 2008

Nebula3 February 2008 SDK

Alright, here it is finally:

N3SDK_Feb2008.exe

Notable new features:
  • a "quick'n'dirty port" of our current Nebula2 modular character and animation system
  • PSSM-VSM shadow support for global light sources (which I'm not quite happy with yet)
  • some restructuring and cleanup in the Application layer
The internal Wii port is coming along nicely: Johannes got SQLite up and running on the Wii (which will also be helpful for other console ports), and apart from physics and collision detection, the Application Layer is now running on the Wii.

During the last 2 weekends I started to implement a new AsyncGraphics subsystem, which will put the Graphics subsystem and everything beneath it under its own thread. I'll write more about this in another post soon.

Finally here's a new screenshot of the new "demo character":

9 Feb 2008

COD4 Multiplayer

I have developed a serious addiction to COD4 multiplayer. Usually I'm not that big into competitive multiplayer, but I'm having my phases. Back in the old days it was Counter-Strike, a few years later I was excessively playing Battlefield 2, and now for the last couple of weeks I'm totally hooked on COD4. On a console. With a gamepad. I tried the competitive multiplayer portions of Gears Of War, Rainbow Six Vegas and Halo 3 before, and none of them really clicked with me the way the old-school PC shooters did. But COD4... one match did it. The reason - I think - is that COD4 is the love-child of CS and BF2, my two favorites of the past. Sessions feel very fast-paced, with an initial rush to the control points, just like in CS. The class system and up-leveling is more like BF2, but it's more back-to-the-roots, especially the class system, which doesn't feature advanced classes like medics or engineers; instead, all classes are strictly offensive. Every single feature is extremely polished, and everything which would hinder the flow and speed of the game has been removed. And my personal killer feature: you don't have to communicate a lot. Nothing breaks the immersion more than some jerks talking about completely unrelated shit like their last GameStop visit during a Team Deathmatch in some destroyed middle-eastern city. Fortunately the game plays just as well without a headset. Big win.

2 Feb 2008

D3D Debugging

I just spent a bit of time debugging the D3D9-specific code. Running the test viewer under the D3D debug runtime with the warning level set to highest reveals two warnings:
  • redundant render state switches (which I'm ignoring for now; the frame shader system already helps to reduce redundant state switches a lot, and fixing the remaining state switch warnings would involve implementing an ID3DXEffectStateManager, but before I do this I want to make sure that my own redundant state switch detection would actually be faster than D3D's)
  • more serious is the second warning: "render target was detected as bound, but couldn't detect if texture was actually used in rendering". This is only a real problem if you want to read from the same render target you're currently rendering to, which isn't happening anywhere in Nebula3 (unless you screw up the frame-shaders). I fixed the warning by adding an "UnbindD3D9Resources()" method to the D3D9RenderDevice, which is called at the end of a rendering pass and before the D3D9 device is shut down. The method simply sets the texture stages, vertex buffer and index buffer of the device to NULL (a sketch follows after this list).
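For illustration, UnbindD3D9Resources() boils down to something like the following hedged sketch (the member name d3d9Device and the number of texture stages are assumptions, not the actual Nebula3 code):

void D3D9RenderDevice::UnbindD3D9Resources()
{
    // unbind all texture stages so that no render target is still bound
    // as a texture when the next pass starts
    IDirect3DDevice9* dev = this->d3d9Device;
    DWORD stage;
    for (stage = 0; stage < 8; stage++)
    {
        dev->SetTexture(stage, NULL);
    }
    // unbind vertex and index buffers
    dev->SetStreamSource(0, NULL, 0, 0);
    dev->SetIndices(NULL);
}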
The most serious problem though was that Direct3D reported memory leaks when shutting down the application. Finding a D3D9 memory leak can be tricky, but thankfully Direct3D has a nice mechanism built into its debug runtime to find the allocation which causes the leak: on shutdown, D3D writes a memleak log to debug-out, where each memory leak is given a unique id. The problem is that even one forgotten Release() call can generate hundreds of memory leaks because of D3D's many internal dependencies. The most interesting leak however is usually the last one reported, at the bottom of the leak report. To find the offending allocation, open the DirectX Control Panel, go to the Direct3D 9 tab, and enter the last reported AllocID into the "Break On AllocID" field. Run the application in the debugger, and it should break at the allocation call in question. Turns out I had forgotten 3 Release() calls: one in D3D9RenderTarget::BeginPass() after obtaining the backbuffer surface from the D3D9 device, one in D3D9RenderTarget::EndPass() after a GetSurfaceLevel(), and one after another GetSurfaceLevel() call in D3D9RenderDevice::SaveScreenShot().

The moral of the story: read the D3D9 docs carefully even for "uninteresting" functions, and run the application through the debug runtime after each change to the rendering code, to prevent such bugs from piling up to unmanageable levels.

I wanted to get this stuff fixed before the January SDK release, so this will come next week as time permits. Drakensang bugfixing and optimization has full priority for me at the moment.

Completely unrelated:
  • Watched Death Proof yesterday, and I was a little bit disappointed. The first half was outright boring, then 10 seconds with (probably) the most spectacular (and gory) car crash in movie history. The second half with the "new girls" was actually really good, but the car-chase at the end and the finale was quite a letdown as well, guess I was hoping to see Kurt Russel die in a more spectacular way hehe...
  • RezHD on XBLA is ... wow.

17 Jan 2008

First Render: Wii

Here's the first screenshot from the Nebula3 Wii port. There are still some rendering artefacts which we're looking into, but technically it's the first image rendered entirely through the Nebula3 rendering pipeline on the Wii :)