Code Caving

Lately, we’ve been doing a lot of code caving at The Engine Co. Expert code explorers, like Fabien Sanglard, have an amazing ability to take very complex codebases and understand their core concepts. They can extend them and make them turn all kinds of tricks.

While you shouldn’t miss Fabien’s analysis of Quake 3, you don’t have to be an expert game developer working with major engines to gain huge benefits from a little code literacy. You are far more likely to work within the confines of a big codebase than you are to be working on a blue sky project with no dependencies or baggage.

Being able to dig through layers of code and understand complex existing systems is a huge benefit to your capabilities as a coder. So let me lay out my guidelines for becoming an awesome code caver able to dig through the muddiest of codebases to get things done. Starting with…

Look Deeper Than The Surface

“But wait,” you say, “I’m a JavaScript coder. My language is so simple and elegant! All I have to know is jQuery and I’m set.”

You poor, poor fool. What you aren’t seeing when you declare your very clever closure-based MVC microframework is the 5 million lines of C++ and 1500 man-years that went into the browser that is running it. The JIT that converts your template parser to machine code so it runs fast. The highly tuned GC that lets you kind of forget about memory management. And the vast cross platform rendering codebase that actually lets you draw “Hello World” and funny mustache pictures on screen.

And that browser sits on top of the operating system, which is ten times bigger in lines of code and years of effort. Now hopefully you don’t have to worry about the operating system. But suppose you have a neat little node.js server, and you run a neat little JS powered web service with it, and you get a neat little denial of service attack from some jerk in Eastern Europe. Now all of a sudden those unimportant details like how the operating system handles sockets become vastly important.

Tools of the Trade

How do you cut through a huge codebase like butter? How can you hone in on the insights that make you look like a genius? How can you do all this without breaking a sweat?

grep to grok

The first essential tool is grep. grep looks for a string in a set of files. There are many equivalents to grep (WinGrep, find-in-files in a favorite IDE, find on Windows/DOS), but the key is you can give it a folder or set of files to search and a regular expression to look for, and get back a list of results and the files that contained them. I like to use find-in-files in Sublime Text
for this because I can click search results and jump to them.

Grepping is super powerful. Suppose you get some weird error message – “Floorblop invalid.” – and nothing else from your app or service. You might be left with some questions. What’s a floorblop? Why is it invalid? Well, it might not matter. Just grep the error string and look at the surrounding code to determine a cause. Sometimes errors are generated dynamically so you might have to look for part of the string like ‘ invalid.”‘ if Floorblop was determined at runtime. With a little bit of cleverness, you can almost always reduce the number of potential sites for the error in the codebase to a manageable number – then just inspect the search results manually to find the place triggering the error.

Now you don’t care about the text of the error, but instead can start looking for functions that might be failing and causing the error to be emitted. In effect you can think of the error string as a unique identifier for some situation rather than a human-relevant string. (And in many codebases, error messages are cryptic at best.)

In the course of investigating your error message, you might encounter some variables or functions that seem to be related. Grep comes to the rescue once again: grep for them! Generally symbols are unique in a project, so you can quickly find all the code relating to a variable or function. You can also take advantage of the language’s syntax to help narrow down your results – for instance, in C++ function implementations in a class or namespace often look like this:

void Foo::bar()

So if you want to find the definition of bar, you can grep for “::bar”, or if it’s a global function, use a return type, ie, “void bar” and go right to the definition. Think about fragments that would match lines you want to see – you don’t have to get an exact match, just one that’s close enough you can find what you want quickly.

It should be obvious that these techniques can apply to any language, not just C/C++ – in Ruby you might use “def” in your search strings, in JS you might use “function”, in assembly, colons. The goal is to just filter down the number of results you have to consider, not take them to zero.

Binary Spelunking with strings

It’s easy to think of a binary as an opaque file. So it can be frustrating when you have a codebase with binaries in it. But fear not: DLLs, EXEs, .SOs, and a.out files have a ton of useful information in them. They have to – otherwise the operating system wouldn’t be able to run them!

One of the first tools to pull out is strings. Strings simply scans any binary file looking for bytes that look like they might be a string (displayable ascii characters ending with \0). When it finds a match, it prints it out. You can store the results in a file, or run them through grep or another tool, to help narrow results.

So suppose you have a function for which you cannot find the implementation – but you notice that the codebase includes a binary .so or DLL. You can check if the function is prepackaged in the binary by running strings and grepping for the name. If you get some hits there and nowhere else, you now have a good hint that you might not have all the source code.

Pulling Back the Hood: strace and wireshark

Maybe you’re getting a weird failure when interacting with your app. The first thing to do is establish exactly what’s happening! strace is a super powerful tool on Linux that dumps every system call a program makes. With this, you can see when and what is being read from disk, what network activity is happening, if it’s requesting memory from the system, and so on. All of a sudden a cryptic dynamic linker error will be easy to diagnose – you can see if libraries are being loaded and if symbols are being found or not. Or you can see what OS call is hanging the process. Super useful, if a bit tedious.

On Windows, SysInternals offers similar tools to inspect process activity.

Cross platform Wireshark is a packet sniffer that can really help you understand network activity. Suppose you are getting a cryptic error relating to some HTTP service. You might not be sure if the service is up, if you have a bad header, if the connection is being dropped… Wireshark will let you tell right away if data is making it to the server and what it is exactly, as well as what the server is sending back to you. It will identify things like corrupt TCP packets that are very rare failures but can totally break your app.

Once you’ve established the lowest levels are working properly, it’s easy to move up the stack to HTTP inspection tools like Charles or Telerik Fiddler. These can inspect HTTPS traffic and give you a higher level view of your app’s behavior. Chrome and Firefox also have built in tools that are similar.

Abusing Your Profiler and Debugger

You can also abuse profiling tools like Instruments to find out what an app is doing. On Windows, Windows Performance Analyzer and xperfview are the tools to try first. These tools allow you to attach to a running process on your system and see the callstacks and where time is spent in them.

In other words, they show you HOW the code is running in various situations and which are most important (due to being at top of call stack or being called often). It’s like a self-guided tour through an unknown codebase – how convenient!

You can also attach a debugger and use the time honored trigger of hitting pause a lot to see where time is spent during execution – or stepping through function calls to see what calls what. This is a bit more tedious but also very useful in understanding a large codebase.

Study the Fundamentals

The best way to learn to understand big codebases is to… wait for it… study big codebases. While every project is unique, after a while you start to internalize common layouts and development philosophies. As you learn to recognize patterns and see how bigger systems fit together, you gain building blocks that let you map out unfamiliar systems and build your own systems to higher standards.

A few books I’ve found educational in this area:

  • Operating Systems Design and Implementation – A walkthrough of Minix, a small self contained POSIX environment. This goes through every detail of a “pocket” operating system and explains it in detail. Great introduction for the next three books…
  • Mac OS Internals – A great guide to OS X’s structure with detailed examples and code walkthroughs. Understand the fundamentals of OS X while building valuable code analysis skills.
  • Windows Internals, Part 1 (6th Edition) – The same idea, but for Windows. A completely different take on OS design with a different set of constraints on development. In depth and fascinating.

Conclusions: Fighting Dirty & Getting to the Bottom

At the highest level of mastery, you should be prepared to reason about the behavior of any program from its highest levels of UI down to the behavior of the CPU. Every abstraction is leaky, and even though you want to preserve abstractions to keep development manageable, you shouldn’t let that limit your understanding. Get to the bottom of the stack first, then work your way back.

Don’t be afraid to fight dirty. Do you think something is important? Do you have an assumption about how some code works? Try breaking it to see how it fails. You’d be amazed how many problems can be solved by asking “Is it plugged in?”. Use version control or backups to make sure you can safely make dangerous changes and roll back to a safe state.

Finally – optimize for your sanity and development time. Don’t spend 12 hours when you can spend 1. Think about how you can narrow down your problem and check your assumptions quickly so you can focus on gaining knowledge and making the system do what you need. There are a universe of tools to make it possible to wrangle complex systems. Use them!

Flatten Your Conditionals!

Deep nesting is a pet peeve of mine. I’m going to show you what deeply nested code is and discuss some strategies for keeping things tidy. It’s my opinion that deep nesting is a sign of sloppy code.

You know, code like this (with my condolences to the author):

    if (productId != nil) {

        NSLog(@"EBPurchase requestProduct: %@", productId);

        if ([SKPaymentQueue canMakePayments]) {
            // Yes, In-App Purchase is enabled on this device.
            // Proceed to fetch available In-App Purchase items.

            // Initiate a product request of the Product ID.
            SKProductsRequest *prodRequest = [[SKProductsRequest alloc] initWithProductIdentifiers:[NSSet setWithObject:productId]];
            prodRequest.delegate = self;
            [prodRequest start];
            [prodRequest release];

            return YES;

        } else {
            // Notify user that In-App Purchase is Disabled.

            NSLog(@"EBPurchase requestProduct: IAP Disabled");

            return NO;

    } else {

        NSLog(@"EBPurchase requestProduct: productId = NIL");

        return NO;

This code is hard to understand. It’s hard to understand because error handling is distant from the error checks (for instance, the check for nil is at the beginning but the error and return are at the end!). It’s hard to understand because the important parts are deeply indented, giving you less headroom. If you want to add additional checks, it’s hard to know where to add them – and you have to touch lots of unrelated lines to change indent level. And there are many exit points scattered throughout. GROSS.

Whenever I see code like this I cringe. When I get the chance, I like to untangle it (or even catch it in code review). It’s soothing, simple work. To be sure, the functionality of the code is fine – it’s purely how it is written that annoys me.

There’s a key thing to be aware of in the structure of this code – it has a bunch of early outs related to error handling. This is a common pattern so it’s worth walking through the cleanup process. Let’s pull the first block out:

    if(productId == nil)
        NSLog(@"EBPurchase requestProduct: productId = NIL");
        return NO;

    NSLog(@"EBPurchase requestProduct: %@", productId);

    if ([SKPaymentQueue canMakePayments] == YES)
        // Initiate a product request of the Product ID.
        SKProductsRequest *prodRequest = [[SKProductsRequest alloc] initWithProductIdentifiers:[NSSet setWithObject:productId]];
        prodRequest.delegate = self;
        [prodRequest start];
        [prodRequest release];

        return YES;
        // Notify user that In-App Purchase is Disabled.
        NSLog(@"EBPurchase requestProduct: IAP Disabled");
        return NO;

    // Never get here.
    return NO;

It’s a LOT better, but now we have a return that can never be run. Some error handling code is still far from the error detecting code. So still a little messy. Let’s do the same cleanup again on the second block:

    if(productId == nil)
        NSLog(@"EBPurchase requestProduct: productId = NIL");
        return NO;

    NSLog(@"EBPurchase requestProduct: %@", productId);

    if ([SKPaymentQueue canMakePayments] == NO)
        // Notify user that In-App Purchase is Disabled.
        NSLog(@"EBPurchase requestProduct: IAP Disabled");
        return NO;

    // Initiate a product request of the Product ID.
    SKProductsRequest *prodRequest = [[SKProductsRequest alloc] initWithProductIdentifiers:[NSSet setWithObject:productId]];
    prodRequest.delegate = self;
    [prodRequest start];
    [prodRequest release];

    return YES;

See how much cleaner that is? Beyond saving indents, it also exposes the structure of the algorithm a great deal more clearly – check it out:

  1. Check for nil productId; bail if absent.
  2. Log productId if it is present.
  3. Check if we can make payments/IAP is active; bail if not.
  4. Submit the product info request.
  5. Return success!

The code and its “flowchart” now match up nicely, and if you modify one, it’s easy to identify the change in the other. This might seem like a little thing, but I find it shows that the purpose + structure of the function is well set up. And if you can’t write the function without violating this rule, it’s often a very solid clue you need to introduce some more abstraction – tactics such as breaking stuff up into helper methods, reorganizing your data structures a little bit, centralizing lookups/checks, and so on.

Something to keep in mind next time you find yourself hitting tab more than a couple times – flatten your conditionals!

I Wrote A Book: Video Game Optimization

More precisely, Eric Preisz and I wrote a book!

The book is called Video Game Optimization, and it covers everything you need to know to get maximum performance from any software project – but especially games. If you’re struggling with getting a great framerate out of your game, I highly recommend checking it out. 😉

Video Game Optimization goes all the way from high-level concepts like planning for performance in your project’s timeline, to determining which broad area of your system is a bottleneck, down to specific tips and tricks for optimizing for space, time, and interactivity. Based on the course that Eric Preisz taught at Full Sail University on optimization, it isn’t the only book you’d ever want to read on the subject, but we think it is a great introduction!

The journey from that initial conversation where our mutual friend Jay Moore introduced us and suggested I would be a good co-author, to the day when we finished and shipped the book was a long but rewarding trek. Eric moved across the country from Florida to Nevada, as he moved from teaching at Full Sail University to running the Tech and Tools group at InstantAction. He also became a father with the arrival of his son, Grant. I left after 5 years at GarageGames and helped build a new company, PushButton Labs.

A lot has changed while we wrote it, but it still felt really good to arrive at GDC, visit the book store outside the exhibition hall, and finding a big stack of Video Game Optimization sitting front and center. 🙂

3D in Flash 10 & Git


I spent a little time with Flash 10’s 3d features recently. Since Flash 10.1 is imminent and FP10 has been at 90%+ penetration for a while now, it’s probably safe to start looking at using FP10 stuff in my projects. 🙂

I also used this as an opportunity to try out git. It was easy to get git installed on OSX (I used the command line version, installed from git-osx-installer) and put my code up on Github. You can browse my test code at

My main concern was the transformation pipeline – I think there might be some benefits to using 3d positions for the rendering pipeline in PBE. So I wanted to do a brief survey of the built-in 3d capabilities, then look more closely at the transformation routines.

My first test was making a DisplayObject rotate in 3d (minimal source code). It runs well, and if you turn on redraw region display, you can see that it’s properly calculating the parts of the screen that need to be modified.

This was easy to write, but it revealed the primary flaws with the built-in Flash 3d capabilities. First, look closely – the edges of the shape were sharp, but the interior detail is aliased. This is because whenever the 3D rendering path is engaged, it turns on cacheAsBitmap. This is fine for simple scenarios (say, taking a single element in a UI and giving it a nice transition) but not for more complex situations.

Which brings us to the second and bigger flaw. I added a thousand simple particle sprites at different 3d positions (source code). This runs extremely slowly because of an issue described by Keith Peters involving nested 3d transforms. Nested 3d objects cause excessive bitmap caching, dramatically reducing performance. You might end up with bitmap-caching in action on every 3d object and every DisplayObject containing 3d objects.

In addition, because it’s cached, moving objects further/closer from the camera results in an upsampled/downsampled image. So you tend to get mediocre visual results if your objects move much.

My next step was to stop using the 3d capabilities of DisplayObjects, and just position them in x/y based on their 3D position (source code, notice it is two files now). This gave a massive performance gain. At low quality, 1000 particles runs at 1440×700 at acceptable FPS. Most of the overhead is in the Flash renderer, not the code to update DisplayObject positions, but it still takes a while to do all the transformation, and it causes a lot of pressure on the garbage collector from all the 1000 temporary Vector3D instances that are created every frame. (600kb/second or so – not insignificant.)

Next I figured it would be helpful to make my camera move around (sample code).

This required that I understand the coordinate space all this operated in. What are the coordinate spaces? According to the official docs, screen XY maps to world XY. So forward is Z+, up is Y-, right is X+. Once I figured that out, I had to prepare a worldMatrix with the transform of the camera, then append the projectionMatrix. The PerspectiveProjection class always seems to assume screen coordinate (0,0) is the center of the projection so you will have to manually offset. Maybe I was not using the projection right, since the docs imply otherwise.

There were two other details to sort out. First, I had to reject objects behind the camera, and second, I had to scale objects correctly so they appeared to have perspective. The solution revolved around the same information – the pre-projection Z. By hiding all objects with Z < 0 and scaling by focalLength / preZ, I was able to get it to behave properly.

Next up is Matrix3D.transformVector… which is slow. Transforming 1000 vectors eats 3ms in release build! This is really slow in absolute terms (Ralph Hauwert has a good example of the same functionality running much much faster). I didn’t really want to introduce Alchemy for this project. But we can use AS3 code that avoids the allocations, saving us GC overhead and getting us an incremental improvement in performance.

Andre Michelle has some interesting thoughts on the problem of temporary objects related to transformations (see I did notice that Utils3D.projectVectors had some options for avoiding allocations, but it did not seem to run significantly faster (even in release build). (sample code for using projectVectors)

In the end, I settled on my own implementation of transformVectors, as it seemed to give the best balance between performance and ease of us. There’s a final version of the sample app where you can toggle between transformVector and the AS3 version by commenting out line 105/106 up on github, so you can test it for yourself. The transform function took some effort to get right, so here it is to save you the pain of implementing it yourself. It transform i by m and stores it in o.

        final public function transformVec(m:Matrix3D, i:Vector3D, o:Vector3D):void
            const x:Number = i.x, y:Number = i.y, z:Number = i.z;
            const d:Vector. = m.rawData;

            o.x = x * d[0] + y * d[4] + z * d[8] + d[12];
            o.y = x * d[1] + y * d[5] + z * d[9] + d[13];
            o.z = x * d[2] + y * d[6] + z * d[10] + d[14];
            o.w = x * d[3] + y * d[7] + z * d[11] + d[15];

Time for some conclusions. I think that the 3D capabilities built into DisplayObject are OK, but very focused on light-weight graphic design use. Building a significant 3D application requires you write your own rendering code built on Flash’s 2D capabilities (either DisplayObjects or drawTriangles and friends). The 3d math classes are ok, but immature. Some things are very handy (like the prepend/append versions of all the methods on Matrix3D), but the tendency for Flash APIs to implicitly allocate temporary objects limits the usefulness of some the most central API calls. In addition, important assumptions like the order of the values in Matrix3D.rawData were not documented, leading to frustrating trial and error. I am excited to see Flash’s 3d capabilities mature.

Tweaking your game with Google Spreadsheets

tweaksheet Our latest game, Grunts: Skirmish, has 200 tweakable parameters. There are 9 player units with three levels of upgrade, and another 9 enemy units. Each unit has between three and ten parameters that can be altered.

We tried a few approaches – hand-editing a large XML file (but it was too large and spread out) and an in-game tweaking UI (but it was too much work to get the UI to be friendly to use). The old standby of having the designer and artist file bug reports to have the programmer update the game wasn’t getting us very far, either.

“Well,” says I, “We’re some sort of Web 2.0 startup, right? And we’re developing a Flash game aren’t we? And Flash can talk to websites, can’t it? And don’t we use Google Docs for everything?”

It turns out there is an XML feed from public Google spreadsheets. And ActionScript 3 supports E4X, so you can directly manipulate XML without any extra work. Now we tweak our game using a shared spreadsheet up on Google Docs.

I wrote a parser for their format:

// Extract the entries. It's namespaced, so deal with that.
var xmlns:Namespace = new Namespace("xmlns", "");

// Parse into a dictionary.
var cellDictionary:Dictionary = new Dictionary();
for each(var entryXML:XML in tweakXML.xmlns::entry)
   cellDictionary[entryXML.xmlns::title.toString()] = entryXML.xmlns::content.toString();

And wrote a quick component that would fetch the spreadsheet feed, parse it, and stuff it into the right places on named objects or template data. Now I have a little entry in our level file that looks like:

<object id="googleTweaker">
   <component class="com.pblabs.debug.GoogleSpreadsheetTweaker">
        <!--  Grunt Level 1 tweaks -->

Each line maps a cell in the spreadsheet to a property on a template or active game object. Some properties have to be set several places, which the system deals with automatically.

The biggest wrinkle was Google’s crossdomain.xml policy. Basically they do not allow random Flash apps to access their site. So I had to write a small proxy script, which sits on our development server next to the game and fetches the data for it. Figuring out I had to do this took more time than any other step.

The main difference between the snippet and the full code is the version in our repository is 220 lines long. I only have around 150 of the full set of 200 parameters hooked up, but after a hard afternoon’s work, the process for tweaking the game has become:

  1. Open Google Docs.
  2. Edit a clearly labeled value – like level 1 grunt health.
  3. Restart the game, which is running in another tab in your browser.

This takes you about a minute between trials. Not too bad. Before this, the process was:

  1. Get the game source from SVN.
  2. Find the right XML file – there are several.
  3. Find the right section in the XML – altogether we have 200kb of the stuff for Grunts!
  4. Change the value.
  5. Commit the change.
  6. Wait 5-15 minutes for the build system to refresh the live version of the game.

Ten minutes per tweak is not a good way to develop.

I’ve heard about developers using Excel spreadsheets for tweaking, but can’t find anything about using Google Docs to do it. But Google Spreadsheet is obviously a better choice. It has built-in revision tracking. You can edit it simultaneously with someone else. You can access live data in XML either publicly (like we did) or privately via their authentication API. It’s absolutely worth the half-day of your time it will take to add Google Spreadsheet-based tweaking to your game – even if it’s a non-Flash game, downloading and parsing XML is pretty easy with the right libraries.

I strongly suspect this feature will find its way into the next beta of the PushButton Engine. Which, by the way, you should sign up for if you are interested in developing Flash games. We’re bringing people in from the signup form starting this week. If you want more information, or just like looking at cool websites, click below to check out the new version of the PBEngine site, which has a bunch of information on the tech. Tim did an awesome job on the site design.


Edit: Patrick over on the GG forums asked about the proxy script. It’s actually ludicrously simple. Not very secure either so I wouldn’t recommend deploying it on a public server. I got my script from a post on Aden Forshaw’s blog. In the real world you would want to have some security token to limit access to your proxy script… but since this is for tweaking a game that is in development I didn’t sweat it.

Tip: Setting Up Flex Builder The Sane Way

Flex Logo

There is a right and a wrong way to set up Flex Builder. The wrong way is to get the Flex Builder package from the Adobe site. It’s running on a super old version of Eclipse, and it lacks a lot of useful editors and functionality. I have lost many man-hours of productivity to this version of Flex, which is why I am writing this post.

What you want to do is this: Go get the latest, vanilla Java-IDE version of Eclipse. You can do this at the Eclipse downloads page, the package called “Eclipse IDE For Java Developers.” (Not Java SE, unless you want an extra hundred megs of Java tools.)

Now, go to the Adobe Flex download page, and go to download it – but scroll waaay down and get the plug-in instead.

Finally, install Eclipse – then install the plugin. Now you have a fully up to date, ready to go version of Flex, as well as a bunch of nice tools that come with Eclipse for Java. In my experience, this is significantly more reliable and responsive than the all-in-one build from Adobe.

Edit: Don’t forget to do Help -> Search for Flex Builder Updates and Help -> Software Updates… to make sure that you are fully up to date!

Working with the Apple Remote on Boot Camp

Apple Remote Maybe you want to listen to ALL the events that come in off the Apple Remote. I don’t know how to do that under OS X, but here’s how to do it under Windows with Boot Camp:

  1. Open up the System icon from the Control Panel.
  2. Choose the Hardware tab.
  3. Click on the Device Manager button.
  4. Expand “Human Interface Devices” in the tree view.
  5. Right click on “Apple IR Receiver” and choose Update Drivers…
  6. Choose “Install from a list or specific location” and click next.
  7. Choose “Don’t search.” and click next.
  8. You’ll see a list with several options. Choose “USB Human Interface Device” and hit Next.
  9. Continue clicking “Next” to get through the wizard. Eventually you’ll be asked to reboot, after the drivers are installed. Do so.
  10. Reboot. Now you will get complete and unfiltered HID events from the IR Receiver. In addition, +/- and Menu will no longer change your volume or launch iTunes.

At this point you could grab a program like EventGhost, and use the HID plugin (at the bottom of the list of plugins) to listen to events from the remote and do whatever you like with them! Be sure to uncheck “Trigger enduring events for buttons” and check “Use raw Data as event name” or you won’t see any events.


%d bloggers like this: