Game Object Structure: Scene Graphs
By Kyle Wilson
Monday, July 22, 2002Last time around, I discussed the benefits and costs of using a static class hierarchy for game objects. Such use of inheritance by game objects is an example of compile-time hierarchy. There are two other kinds of inheritance hierarchies that are of interest in game object design, however.
The first is load-time hierarchy. Load-time hierarchy is used by engines that provide data inheritance. The Dungeon Siege engine, for example, allows designers to define game object "classes" in an XML-like specification language. A game object "class" can inherit properties from a parent class. At load time, however, the inheritance hierarchy is flattened. Regardless of how a game object is described in the specification language, it's represented by only a single C++ object, which is not part of either a compile-time or run-time object hierarchy.
Any object-oriented scripting language that supports inheritance could similarly be considered an example of load-time hierarchy. As game engines grow more powerful, the line between data and language blurs.
Load-time hierarchy is interesting, and perhaps I'll write about it more later. For now, though, I'm concerned with the third kind of inheritance, run-time hierarchy. Specifically, I'm concerned with the use of run-time hierarchy as used to represent a 3D scene in a scene graph. In developing a 3D engine, one of the first decisions you'll have to make is whether or not to keep your game objects in a scene graph.
Objects in a Scene Graph
In a scene graph, each node contains a transformation matrix and, possibly, a renderable object. As the tree is traversed for rendering, transformation matrices are concatenated and objects are rendered with the resulting transform. Transformation matrices may be animated by AI, triggers, and so forth.
At Cyan, the Plasma engine used a scene graph modeled after the scene hierarchy in 3D Studio Max. I suspect that NDL uses a similar system in NetImmerse, since the three original architects of Plasma came from NDL. Radon Labs, the developers of Urban Assault, use a scene graph modeled after Maya for their engine, The Nebula Device. If you're going to use a scene graph in your game engine, you can do a lot worse than to base its design on one of the standard 3D modeling packages. First, the designers of Max and Maya probably put a lot more time into designing their systems than you can afford to do, and their designs have stood the test of time over many releases. Second, it's always beneficial to have the data structures in your game mirror the data structures in your tools, and have the data structures in your tools in turn mirror the representations of that data which are exposed to the artists and designers. Otherwise, things tend to get muddled.
Unfortunately, the needs of a game engine aren't the same as the needs of a 3D modeling package. At Cyan, we had problems because the nodes in our scene graph served two purposes: they were game objects, and they were the parents of game objects. It wasn't clear when triggering flags for visibility and animation and the like whether changes were applicable only to the current node or to the whole hierarchy rooted at a given node. We would have been better off making the separation made by SGI's Performer or MultiGen. In their hierarchies, only leaf nodes are renderable. Internal nodes contain transform information and a list of children. (The Gang of Four make a similar separation between leaf an internal nodes in their Composite pattern[1].) At Cyan, we awkwardly triggered functionality that Performer and MultiGen encapsulate neatly in switch nodes and LOD nodes. Switch nodes and LOD nodes render a subset of their children based on scripted conditions or viewing distance.
If you're willing to get even further away from traditional scene graph representations, you can increase the power of a scene graph even more. Using a directed acyclic graph instead of a tree, you can share objects and animation state across multiple locations in your world. You can also concatenate additional state with transformation matrices during tree traversal. For example, you could specify materials on internal nodes. Or, at a higher level, you could use effects nodes. You could insert a "flame node" into the hierarchy which would indicate to the renderer that all its children should be rendered with a fire shader. The more exotic your hierarchy gets, of course, the more custom tools you need to support it and the more difficult those tools become to write. And the more difficult everything gets to debug.
Moving Beyond Scene Graphs
There are two problems with using a scene graph to store game objects. The first is that 3D graphics technology is rapidly advancing, and not all those advances fit well with scene graphs. In a hierarchical animation, you'd traditionally store a mesh at each node in the hierarchy. If you're trying to render one skinned creature, though, you'd rather render the union of the meshes once with a matrix palette taken from the concatenated node transforms. You can do this in a scene graph, of course, but it becomes an awkward special case, where one node "side-indexes" into a sibling hierarchy. You'll have the same problem animating control points for higher-order primitives. Height fields, lights, volume fog and shadows are similarly likely to end up being awkward special cases.
Those are problems with using scene graphs in a graphics engine. A more serious problem is that a game engine has different requirements from a graphics engine. A joint in an animated hierarchy is not likely to map well to a game object. It may be useful to load or delete individual game objects on demand. This is awkward with parts of a hierarchy. A game object may have AI, inventory, skills and statistics. What owns this information in a scene-graph-based game object system? The root node of a bipedal hierarchy?
One particular difference between a graphics engine and a game engine merits further discussion. Any game engine is a simulation of a particular reality, with its own rules. A scene graph traditionally describes a particular physical universe in which every object moves rigidly in the space defined by its parent. The scene graph thus acts outside the scope of a graphics engine and acts as a particularly limited physics engine, in which all constraints are absolute. As long as all your objects are supposed to be fixed in the space of their parents, this is more efficient than running repeated iterations of stiff spring models to get the proper object motion. If you're remaking Kings Quest in 3D and all your animations are pre-scripted, a scene graph is perfect. If you're modeling shock absorbing springs for a racing simulation, though, it's just going to be painful and confusing trying to map the forces and motion for wheels back into parent/car space.
So what's the alternative to scene graphs, then? You can simply keep your game objects in a list, of course. We did that at Interactive Magic and at iROCK. Any animation hierarchies were encapsulated down in the renderable object that each game object owned. This worked well for the game objects, but had some shortcomings. Essentially, the problem was that the hierarchical transforms were hidden down in the 3D graphics code. That meant that we couldn't do any kind of accurate collision with skinned creatures, even if we had bounding boxes for every bone, because the physics system didn't know where the bones were.
Attachment is also rather painful in a scene list system. If your character game object picks up a sword game object... where's the sword? How does it know where the character's hand is? (That's the problem with hidden transform information again.) How does the sword make sure that the hand position has updated when the sword position updates? And if there really is a sword game object that's getting a transform matrix from the character game object every frame, aren't we really just back to a scene graph system? Maybe the sword game object should just go away, and just pass its renderable/model/mesh information off to the character game object, but that's not a very generalizable solution. And what if you want to drop the sword again?
You may notice that the further I get into this article, the more questions I have, and the fewer answers.
At this point, I think any game object system essentially has to answer three questions:
Where do I attach the AI controller? How do I do skinned animation? How do I make a character object pick up a sword object?Moving into the realm of guesswork, I think the best answer to these questions is to have a transform hierarchy, rather than a scene hierarchy. Game objects are what you'd logically think of as game objects -- a character or a sword, not a torso or a forearm. Their renderables, and their physics objects, index into the transform hierarchy as needed. And when you pick up a sword, I think the sword game object transfers/creates some components into the character game object and then it deletes itself from the scene. I'm worried about the debugability of such a system -- the more you separate out functionality, the harder it is to figure out where you are and what belongs to what as you step through the code. But at least its some kind of answer to those essential questions.
But then how do you drop a sword?
[1] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
I'm Kyle Wilson. I've worked in the game industry since I got out of grad school in 1997. Any opinions expressed herein are in no way representative of those of my employers.