Musings on Game Engine Design

GDC 2008


This year was the tenth anniversary of my first GDC.  It’s milestones like this that make one pause and take stock of one’s life.  Over the last decade, one of my college classmates was appointed United States Attorney for South Carolina; another lost a limb fighting in Iraq; another married actor Steve Martin (yes, that Steve Martin).  And I… I have spent most of my waking hours contributing, in my own small way, to the perpetual adolescence of the American male.

Like a career in game development, GDC is a mix of small rewards and great frustrations–and yet the two somehow balance each other out.  Back in 1998, the attendees were a small brash crowd, excited that gaming had finally arrived and enthusiastic about the future.  I think that was the first time somebody announced the oft-repeated canard about how we’re bigger than the movie industry.  My friends and I all made the crawl from room to room on Suite Night, availing ourselves of the free drinks and the catered food from the warming trays.  There was a party on the Queen Mary.  I got a thick pile of t-shirts and an Intel graphics card, and thought, 3dfx better watch out now that Intel’s getting into the market.

Now GDC has become like SIGGRAPH.  The crowds are enormous–nearly 15,000 in attendance this year, I’m told.  But it’s not the size of the crowds that makes it less intimate.  It’s the fact that so many of the people there don’t really have much to say to each other:  there are indie game developers, console game developers, serious game developers, mobile phone game developers, people selling middleware and hardware and outsourcing to all of the above, recruiters, wannabes, publishers trying to sign developers, developers looking for a publisher, HR folks looking to hire artists and programmers and musicians, and press trying to cover the whole spectacle.  The death of E3 meant that this year there were more sessions than ever that could be summed up as, “look at my game and how awesome it is.”  I tried to avoid those.  I spent my three days at GDC looking for quiet places to talk to the people who do the same thing I do.  I didn’t go to any of the parties.

GDC hasn’t managed to scale to handle its growth.  Internet connectivity was poor throughout the convention center–even worse than last year.  The food is as terrible as ever.  And this year, for the first time, proceedings didn’t become available at all until after the conference, and then only in the form of a slim handful of PowerPoint decks.  The clever people have taken to bringing cameras to the talks and snapping pictures of each slide that goes up, which is distracting but understandable.  The state of the GDC proceedings is an embarrassment to the industry, but it’s one that I don’t expect to change unless CMP stops handing out Gigapasses to speakers who show up without their slides.

My main interest this year was learning how other developers are dealing with problems of scale.  The team I’m working on at Day 1 has gone from three people to eighty people in the last three years.  Our engine just passed a million lines of code.  We’ve had to turn our resource installers into multi-volume installers because they crossed the two-gig threshold that’s the largest file size Inno Setup supports.  I think that we’re dealing with our scale better than GDC is, but it’s always good to gather additional perspectives.  I’ll talk more about the things that we’re doing and what we can learn from the presenters at GDC as I go.

Running Halo 3 Without a Hard Drive

Part of the problem of scale in next-generation console development is that console memory is getting larger and content sizes are growing to match, but DVDs and memory aren’t getting a whole lot faster.  That means that we developers have to work harder and harder to keep players fed with fresh content.

Mat Noguchi’s talk on streaming content in Halo 3 was interesting mostly because it showed how Bungie approached the problem of streaming from the opposite end from what we’ve done at Day 1, and yet we seem to be ending up in similar places.  They took existing level technology and shoehorned streaming into BSP regions.  We built levels up out of smaller streamable primitives.  Bungie has always emphasized separation of file I/O (deciding what gets loaded) and resource management (managing data once it’s been loaded), and they do lots of off-line computation to determine what resources actually need to be loaded for each level.  We allow any asynchronous resource load to trigger other asynchronous resource loads, and don’t return requested resources to the main game thread until the entire requested resource tree is complete.  In short, their system was designed for the most efficient player experience and ours was designed to be as flexible as possible while we’re iterating on content creation.
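To make the "don't hand anything back until the whole tree is loaded" idea concrete, here is a minimal Python sketch using asyncio.  It is an illustration only, not Day 1's or Bungie's code:  the DEPENDENCIES table, the resource names, and the read_from_disc stand-in are all hypothetical.

```python
import asyncio

# Hypothetical dependency table: resource name -> resources it references.
DEPENDENCIES = {
    "level_01": ["mesh_crate", "mesh_wall"],
    "mesh_crate": ["tex_wood"],
    "mesh_wall": ["tex_brick"],
    "tex_wood": [],
    "tex_brick": [],
}


async def read_from_disc(name):
    """Stand-in for an asynchronous disc read (assumption, not a real API)."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return f"<data for {name}>"


async def load_tree(name, loaded):
    """Load `name`, then everything it references, recursively."""
    if name in loaded:
        return  # already loaded or in flight; shared children load once
    loaded[name] = None  # mark as in flight
    loaded[name] = await read_from_disc(name)
    # A finished load may trigger further loads; run them concurrently.
    await asyncio.gather(
        *(load_tree(child, loaded) for child in DEPENDENCIES[name])
    )


async def request_resource(name):
    """Return to the caller only once the entire resource tree is resident."""
    loaded = {}
    await load_tree(name, loaded)
    return loaded


if __name__ == "__main__":
    resident = asyncio.run(request_resource("level_01"))
    print(sorted(resident))
```

The flexibility comes from the fact that nothing has to know the full dependency list up front; the cost is that the set of reads is only discovered as the loads complete.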

In the end, though, we’re doing similar optimizations to get to where we need to be to ship.  We’re sorting resources in the order they’ll be read to minimize DVD seeks, we’re grouping resources that have similar level ownership, and we’re compressing groups that will all be read together.  Like Bungie, we’re having problems with audio resources interacting badly with everything else that needs to be streamed–they solved that problem by just not playing any streamed AI voices if the player doesn’t have a hard drive to cache to.
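The three optimizations above (sort by read order, group by level ownership, compress each group as a unit) can be sketched in a few lines of Python.  This is a simplified illustration under assumed inputs:  the record layout, the resource names, and the use of zlib are my choices for the example, not a description of either studio's pipeline.

```python
import zlib
from collections import defaultdict

# Hypothetical records: (name, levels that use it, first-read order, payload).
RESOURCES = [
    ("tex_brick", frozenset({"level_01"}), 2, b"brick" * 100),
    ("mesh_wall", frozenset({"level_01"}), 1, b"wall" * 100),
    ("tex_hud", frozenset({"level_01", "level_02"}), 0, b"hud" * 100),
    ("mesh_tree", frozenset({"level_02"}), 3, b"tree" * 100),
]


def build_disc_layout(resources):
    """Group resources by level ownership, order them by read order,
    and compress each group so it can be read in one sequential pass."""
    groups = defaultdict(list)
    for name, levels, order, payload in resources:
        groups[levels].append((order, name, payload))

    layout = []
    for levels, members in groups.items():
        members.sort()  # read order within the group
        blob = zlib.compress(b"".join(payload for _, _, payload in members))
        names = [name for _, name, _ in members]
        layout.append((members[0][0], sorted(levels), names, blob))

    # Place groups on disc in the order their first member is read.
    layout.sort(key=lambda group: group[0])
    return [(levels, names, blob) for _, levels, names, blob in layout]


if __name__ == "__main__":
    for levels, names, blob in build_disc_layout(RESOURCES):
        print(levels, names, len(blob))
```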

Bungie seemed to give half the talks at GDC this year, and they were even kind enough to make their slides available–though not on the GDC Proceedings site.

Automating Regression Discovery: Finding the Wrenches in the GEARS OF WAR

I’m the senior engineer at Day 1 most interested in build infrastructure and automated testing, so I always like to hear what other teams are doing in that department.  Epic’s test and metrics infrastructure isn’t quite up to Microsoft standards, but they do as well as any independent developer I’ve heard of.  They use CruiseControl as a continuous integration framework, building all configurations of the Unreal Engine across ten dedicated build machines every time a developer checks in.  They also do continuous integration for content, building fresh cooked resources every time an artist or designer checks in.

The resource build process collects performance and automated test data for each level.  Any crashes get logged to a database.  Mesh and texture size information gets dumped to spreadsheets covering the resource usage of each level.  The system collects performance data from fly-through camera paths, bot matches, human playthroughs, and render sample points scattered around each level.  Their performance gathering for UT3 definitely benefited from the fact that it was a bot-based multiplayer game, but their performance gathering across the board sounds second-to-none.

One of the gems I took away from this talk was the oh-yeah-that’s-obvious point that collecting performance data about average framerate really isn’t very useful.  On the consoles, we’re always V-synced at 60, 30, 20, 15, 10 or 5 Hz anyway–usually 30 or 20.  A more useful statistic to collect is the percent of frames, during a playthrough of any level, that are below 30 Hz.  It’s also useful to collect the percent below 5 or 10 Hz to catch dramatic stalls.  If you have more than a couple of percent below 30 Hz, and if you have any frames below 10 Hz, then something needs to be done to improve performance.
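Here is a minimal Python sketch of that statistic, computed from a list of per-frame times in milliseconds.  The function name, the 2% cutoff standing in for "a couple of percent", and the choice to flag a level when either condition is hit are assumptions made for the example, not numbers from Epic's talk.

```python
def frame_rate_report(frame_times_ms, target_hz=30.0, stall_hz=10.0,
                      max_slow_pct=2.0):
    """Summarize a playthrough by the share of frames slower than
    `target_hz` and slower than `stall_hz`, instead of an average."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")

    target_ms = 1000.0 / target_hz  # 33.3 ms per frame at 30 Hz
    stall_ms = 1000.0 / stall_hz    # 100 ms per frame at 10 Hz
    total = len(frame_times_ms)

    slow = sum(1 for t in frame_times_ms if t > target_ms)
    stalls = sum(1 for t in frame_times_ms if t > stall_ms)
    pct_slow = 100.0 * slow / total

    return {
        "pct_below_target": pct_slow,
        "pct_below_stall": 100.0 * stalls / total,
        # Assumed policy: too many slow frames, or any stall at all.
        "needs_work": pct_slow > max_slow_pct or stalls > 0,
    }


if __name__ == "__main__":
    # 97 frames at roughly 30 Hz, two at 20 Hz, and one 150 ms hitch.
    playthrough = [33.3] * 97 + [50.0] * 2 + [150.0]
    print(frame_rate_report(playthrough))
```

In that sample the single 150 ms hitch barely moves the average frame time, which is exactly why the average is the wrong thing to track.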

Sharing Code Roundtable

I attended the sharing code roundtable partly because we’re sharing the same engine across two projects at Day 1, and partly because of the moderator, Brian Sharp:  I interviewed with him at Ion Storm, I think he’s a smart guy, and I was interested in what he had to say.  In the end, though, I found the session frustrating because no two people in the room even had problems similar enough to be able to have a “this worked for me, what worked for you?” conversation.  We mostly ended up talking past each other.  A few of the forty-or-so attendees:

  • Me:  Working on one big-budget console game, sharing an engine with another game being developed in another Perforce branch, and integrating back and forth every week.
  • Brian Sharp:  Organizing an effort to share code across all Midway studios… his challenge was trying to improve communication between different studios.
  • A guy from EA Canada:  EAC is like a middleware provider to the other EA studios, and as a dedicated middleware provider, he wanted to know how, after giving another studio a drop of some technology, EAC could merge back fixes and new features with a minimum of pain.
  • A guy from academia:  He was working on a distributed development project with undergraduate students in different countries, and wanted to know what the rest of us thought of distributed source control systems like Mercurial and Git.  I’m pretty sure most of us had never heard of them.

I think Brian did a good job moderating, but it just seemed like the wrong group of forty people to have a conversation.  Or maybe it was just the wrong conversation for forty people to have.

I did have one startled and happy moment when the conversation tangentially touched on automated testing.  I had a similar moment in the Gears of War infrastructure talk when Martin Sweitzer talked about Epic’s use of PC-Lint.  In both cases the complaint was the same:  We have all this existing legacy code and it doesn’t have unit tests and Lint generates a million warnings for it.  How can we test the new code we write without having to touch the ugly legacy stuff that we know works and that we’re afraid to change?  Day 1’s in-house Despair Engine is three years old, and it was written with automated unit tests running for every library and a nightly Linting of the entire engine from the very beginning.  We pass all our tests.  We don’t generate any Lint warnings–at least not the ones we care about.  Modern console development is really, really hard.  It’s nice, sometimes, to hear about the problems you don’t have.

Ray Kurzweil

I missed the Microsoft keynote on Wednesday for a meeting, but I was happy to be able to make Ray Kurzweil’s keynote on Thursday.  Kurzweil’s an inventor and futurist, and the author of The Age of Spiritual Machines and The Singularity is Near:  When Humans Transcend Biology.  He’s what we used to call an extropian back while I was in college.  His talk covered much of the same material that he covered in The Age of Spiritual Machines (and presumably The Singularity is Near, though I haven’t read that one):  innovation and intelligence are increasing at an exponential pace, and in the near-future we’re going to cross a super-evolutionary threshold so sudden and so steep that we have no ability to predict the future of humanity beyond it.

All of this has a shrill ring of millenarianism to it that is easily dismissed as the “rapture of the nerds”, but I’ve always been willing to forgive the extropians their evident wackiness because they’re probably right:  human science has advanced more in the last hundred years than in the thousand before that, and while Kurzweil sounds a little optimistic to me–I don’t expect to ascend to nerdvanna in the next twenty years–it’s hard for me to believe that in a hundred years people will still be dying of old age because we still haven’t figured out how the genes for aging work.

So here’s the techno-optimism:  In the last 40 years, Kurzweil says, computing power has increased a billion times and computer size has decreased 100,000 times.  The same will happen again over the next 25 years to produce injectable computers the size of blood cells that can enter your brain and feed your senses a full 3D virtual reality.

Or those injectable computers can rewire your body chemistry.  The software running our bodies is obsolete, evolved for a hunter-gatherer lifestyle tens of thousands of years ago.  Your body stores fat because our genes evolved in a time of scarcity.  But now we live in a time of abundance.  RNA interference is on the verge of being able to turn off genes for storing fat.  Current diet drugs work by suppressing appetite, which, Kurzweil notes, “is like birth control that works by inhibiting interest in sex.”

Our ability to predict the future also evolved in a different time, when exponential progress was so slow that it was indistinguishable from linear progress because the exponential curve was so flat.  Now people make stupid predictions that Social Security will go bankrupt and that greenhouse gasses will melt the ice caps even though technologies are coming that will soon render all those assumptions obsolete.  Within five years, Kurzweil says, nano-engineered solar panels will be more cost-effective than fossil fuels.  Within twenty years, we’ll have largely replaced fossil fuels as a means of power production.

And so on.  Slides here.  If you’re skeptical about his claim that by 2010 desktop computers will disappear and wearable ubiquitous computing devices will project high-definition augmented-reality images directly into our retinas, so am I.  But I want to believe.  Kurzweil has a way of making you feel like we’re living in the future right now.

Life on the Bungie Farm

For Halo 3, Bungie had a build farm of 180 computers with 300 processors.  These ran in-house client/server software written in C# that communicated through atomic transactions in a SQL database to manage

  • code builds and automated tests
  • lightmap rendering
  • content builds
  • other tasks (cubemap rendering, shader compilation, builds of, and maintenance)
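The "atomic transactions in a SQL database" part is the piece that keeps two farm machines from grabbing the same job.  As a rough illustration of the pattern, here is a Python sketch using the standard sqlite3 module.  Bungie's system was written in C#, and the talk as summarized here doesn't say which database or schema they used, so the table layout, job kinds, and worker names below are all invented for the example.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) a hypothetical job queue table."""
    # isolation_level=None puts the connection in autocommit mode so the
    # transactions below can be started and ended explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " kind TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " worker TEXT)"
    )
    return conn


def submit(conn, kind):
    """Queue a job such as a code build or a lightmap render."""
    conn.execute("INSERT INTO jobs (kind) VALUES (?)", (kind,))


def claim_next(conn, worker):
    """Atomically claim the oldest pending job, or return None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, kind FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'running', worker = ? WHERE id = ?",
                (worker, row[0]),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row


if __name__ == "__main__":
    queue = open_queue()
    submit(queue, "code_build")
    submit(queue, "lightmap")
    print(claim_next(queue, "farm-01"))
```

Because the select and the update happen inside one transaction, a second worker asking at the same moment either waits or sees the job already marked as running.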

It’s a neat set-up.  I wish that we had a half-million dollars to spend on build hardware, but honestly I’m not sure what we’d do with that many computers.  The majority of Bungie’s workload was lightmap rendering.  I’ve read elsewhere that they were cooking spherical harmonic coefficients for reflection functions into their lightmaps, similar to Half-Life 2’s orthogonal lightmap basis, which I suppose helps explain why their lightmaps were so much more expensive than, say, lightmap generation in Quake I.

But Day 1’s games are all about environmental destruction, which means that we really can’t get away with static lighting.  All lighting in both our current games is dynamic.  I wouldn’t mind having, say, twenty build machines to throw at code and resource builds, but I think that would take care of our needs just fine.

That said, there are definitely some features of Bungie’s set-up that make me envious.  A C# client/server model communicating through SQL would be a step up from our PHP build scripts communicating through text files.  And Bungie uploads all symbols from their automated builds to a symbol server, as we do, but they also copy all source to a fixed location on the network and link the executable to the location of the source with which it was built using the undocumented-and-not-mentioned-anywhere-else /SOURCEMAP linker option.  We’ll probably make those changes to our build system in the next few months.

Internal & Outsourcer Management of Tools & Pipelines

I’m not really interested in the process of outsourcing, which is more of an art and production function, but I’m very interested in how tools get built and distributed to the people who use them.  Day 1 has about 140 developers working on two different games now.  We build Inno Setup installers for all our tools every night as part of our automated build, they get tested by our QA department every morning, and if there aren’t any showstopper bugs then we update the latest release version in a SQL database.  A little system-tray app that we run turns yellow on everyone’s desktop and prompts them to right-click and select “Update” to install latest.
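The decision the tray app makes is simple enough to sketch.  The Python below is a hypothetical illustration of that check, not our actual tool:  the Release record, the build numbers, and the "idle" state name are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """One nightly tools build and whether QA has signed off on it."""
    build: int
    qa_approved: bool


def latest_approved(releases):
    """Newest build QA has approved, or None if there isn't one."""
    approved = [release.build for release in releases if release.qa_approved]
    return max(approved) if approved else None


def tray_state(installed_build, releases):
    """Return "yellow" when a newer approved build exists, else "idle"."""
    latest = latest_approved(releases)
    if latest is not None and latest > installed_build:
        return "yellow"
    return "idle"


if __name__ == "__main__":
    nightly = [Release(100, True), Release(101, True), Release(102, False)]
    print(tray_state(100, nightly))
```

The point of gating on QA approval is that a nightly build with a showstopper bug never becomes the version that everyone is prompted to install.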

Volition has even larger teams than we do:  100 internal developers and 20 outsourcers on Saints Row; 101 internal developers and 44 outsourcers on Saints Row 2; and 80 internal developers and 52 outsourcers on Red Faction.  Like us, they’ve invested in some automatic distribution infrastructure.  They have an app called vInstaller that runs MSBuild scripts to do tool installs.  I think that I prefer having a proper installer, since it means that our tools can be uninstalled through the Add/Remove Programs option in the Windows Control Panel.  Volition devs have to uninstall through vInstaller, which apparently makes Windows Vista unhappy.  Their IT guys apparently had to disable User Account Control on all their Vista machines.

They did say that on one occasion they’d had the misfortune of auto-deploying a virus to all developers.  We’ve been lucky enough to avoid that one so far.

They integrated an art outsourcing company in Shanghai into their existing tool distribution framework, which created new challenges.  The outsourcers didn’t use Perforce.  Users weren’t admins on their own machines.  And they tended to install tools into many different locations, which broke non-robust tools.

We generate a slightly trimmed-down installer for outsourcers of just the tools that they need to preview assets in our engine.  Here again, I think that we benefit from having a traditional Windows installer instead of a standalone installation app.  I don’t think a user even needs to be an admin to run the installer, and I’m pretty confident that our tools will work whatever path you install them to.  We usually install to C:\Program Files, and we don’t expect to access any other tools or data through relative paths.

Slides for Volition’s talk here.

Unreal Tournament 3 Postmortem

I went to the UT3 postmortem with some hesitation, because Epic has a bit of a rep for turning every talk into a sales pitch for the Unreal Engine.  In the event, I was very pleasantly surprised by the candor with which Jeff Morris and Mike Capps talked about UT3’s successes and shortcomings.

The talk followed the now-traditional What Went Right/What Went Wrong format of every post-mortem in Game Developer magazine, but like most developers, Jeff and Mike seemed to find more passion in talking about what went wrong than what went right:

  • They wasted a lot of effort on a character customization system that had no effect on actual gameplay.  With characters moving at 30 MPH in front of often-animated backgrounds, you couldn’t tell what they looked like.
  • The single-player campaign gameplay came together too late to be polished or balanced.  “Head and shoulders above any UT game we had made in the past, but it wasn’t Bioshock,” Mike said.  Of course, Bioshock is an Unreal Engine game.
  • The game didn’t have enough polish time overall, and releases were scheduled badly.  They diluted their marketing efforts by releasing at different times on different platforms and in different regions.  And they walked into the most competitive season ever for first-person shooters, going up against Bioshock, Call of Duty 4 and Halo 3.  “We’re very big in Russia,” Mike observed sardonically.

The Best Talk I Missed

I like to listen to GDC Audio proceedings while I take long road trips, and Ubisoft Creative Director Clint Hocking showed up on my radar when I heard his excellent talk on Narrative in the Splinter Cell Trilogy and his robust showing at the Game Design Challenge at GDC 2005.  I’d already noticed as I played the Splinter Cell games that all the ones that came out of Ubisoft Montreal were fantastic, and the ones that came out of Ubisoft Shanghai were… not so much.  After hearing Clint’s GDC 2005 talks, I suspected that he might be one of the big reasons for that.

Unfortunately, I had conflicts that kept me from being able to make it to Clint’s talk at GDC 2008, I-fi:  Immersive Fidelity in Game Design, but he’s been generous enough to make his slides and an actual paper available on his blog.  He discusses player immersion in terms both critical and creative.  I’m a proponent of more scientific game design:  I believe that designers should form credible hypotheses regarding what is and isn’t fun and create experiments that test those hypotheses.  I think that application of the scientific method is the only way to consistently create more involving games.  I’m not sure that I-fi is even quite a hypothesis yet, but it’s close, and it suggests avenues of advance that can only make the games we create more compelling.

Looking Ahead

All in all, there was some really good material at GDC this year.  I felt like the overall tone of presentation was somewhat muted, though.  Game developers are a typically cocky bunch, but scaling up to handle hundred-man teams and multi-core development has been a challenge, and I think it’s left a lot of us feeling less like we know The Way to solve these issues.  Everybody’s trying, and the companies that are still standing are clearly doing better than the many companies that aren’t, but even the survivors are a little shell-shocked.

And we have two years until the next console generation comes along and shakes everything up again, multiplying memory and core counts by a factor of eight and team sizes by a factor of two.  “If you’re planning a project today that’s going to take more than six months,” Ray Kurzweil said, “then the pace of change is so great that you really need to take exponential growth into account.”  Our industry has been living with that reality for a while.

Written by Kyle

March 12th, 2008 at 8:51 am

Posted in GDC