Putting Money on the Screen


Jeff Tunnell is an awesome producer (among his other talents), and I’ve learned a lot from working with him. One of his frequent words of guidance while wearing his producer hat? “Put the money on the screen.”

There are a lot of ways to prioritize building software. What feature/change do you tackle first? Depending on the phase of the project you might want to focus on…

  • Vertical slice order. Get full functionality going in a small area first. In games this is usually one complete and playable level. In apps it might be the central interaction, like making a purchase or accessing a service. In websites it might be one complete page out of the entire site. Programmers like to think of this as depth-first traversal.
  • Horizontal slice order. Focus on roughing out broad areas before getting into the details. When I do this, my project usually ends up with a bunch of brightly colored screens with words like “STORE” and “MAIN MENU” and “WORLD MAP” on them. They don’t do much, but you can quickly experience the flow of the app and adjust. In programmer land, this is a breadth-first traversal.
  • Severity order. Also known as “QA driven development” – you ship builds off to be tested, get bugs ranked by severity, and tackle the most serious first. You end up with stable software, but functionality, usability, and polish can suffer depending on the criteria of evaluation. It’s also hard to really change the overall feel of the project when working in this mode.
  • Money on the Screen order. Maximize bang for buck. In other words, look for features that require small amounts of developer time but move the perceived value/quality of the project ahead substantially. The converse is also valuable – look for things that detract from perceived quality that are also easy to fix.

At different times, these are all valid paths. Early on, to flesh out your project, you probably want to do a horizontal or vertical slice. To ship a stable release, you might need to strictly follow severity order. Everything is contextual.

The magic of “money on the screen” is that it gets you thinking about your project from a fresh angle. What is the smallest change I can make to improve this product the most? It’s not the same as polish – you can have a very polished interaction that’s actively bad! Sometimes it’s the last little step to show off a ton of behind the scenes hard work.

Here’s an example. When I was working on Grunts: Skirmish, a tower defense game I built at PushButton Labs, I did a ton of work on the unit behavior. Each on-screen character ran a finite state machine driven by complex AI routines. An animation engine sequenced hundreds of pre-rendered 3D sprites. A data-driven component system tracked weapon status, health status, and updated the navmap. But despite all that, the gameplay felt flat until I added little numbers that floated up when characters were damaged, healed, or upgraded. All of a sudden the complex behaviors behind the scenes were more visible to the player, and it felt GOOD! Money on the screen!

Another example is Marble Blast Ultra, an Xbox 360 launch title I helped build at GarageGames. We had a great (and for the time cutting-edge) shader system, but until very late in the project our marbles were pretty lackluster in appearance. Similarly, we struggled with repetitive-looking tiles on level geometry. Not good for the central element of the game! Small amounts of work enhancing the shaders in both cases dramatically improved the look of the game and really leveraged our investment in a new rendering architecture. Money on the screen!

A non-game example is the documentation for the Loom SDK. Actually writing the documentation and the auto-generation scripts for the docs was a ton of work. But even with all that work, it wasn’t very useful. It was cumbersome to find the right page, and in general the hard work we put in was wasted. Then we realized that the best way to get money on the screen was to add a quick search box. Users could quickly see their available options, choose them, or just try typing terms in to find a match. It only took a day or two of work and made the docs dramatically more useful. Money on the screen!

Every project is an investment, if only in time. Don’t you owe it to yourself to maximize your return? Too many times I’ve seen amazing systems that never put their full value front and center. From time to time, take a step back, look at what you’re doing, and ask yourself – “How can I put money on the screen?” More often than you’d think, a small change can have a huge payoff.

Code Caving

Lately, we’ve been doing a lot of code caving at The Engine Co. Expert code explorers, like Fabien Sanglard, have an amazing ability to take very complex codebases and understand their core concepts. They can extend them and make them turn all kinds of tricks.

While you shouldn’t miss Fabien’s analysis of Quake 3, you don’t have to be an expert game developer working with major engines to gain huge benefits from a little code literacy. You are far more likely to work within the confines of a big codebase than you are to be working on a blue sky project with no dependencies or baggage.

Being able to dig through layers of code and understand complex existing systems is a huge benefit to your capabilities as a coder. So let me lay out my guidelines for becoming an awesome code caver able to dig through the muddiest of codebases to get things done. Starting with…

Look Deeper Than The Surface

“But wait,” you say, “I’m a JavaScript coder. My language is so simple and elegant! All I have to know is jQuery and I’m set.”

You poor, poor fool. What you aren’t seeing when you declare your very clever closure-based MVC microframework is the 5 million lines of C++ and 1500 man-years that went into the browser that is running it. The JIT that converts your template parser to machine code so it runs fast. The highly tuned GC that lets you kind of forget about memory management. And the vast cross platform rendering codebase that actually lets you draw “Hello World” and funny mustache pictures on screen.

And that browser sits on top of the operating system, which is ten times bigger in lines of code and years of effort. Now hopefully you don’t have to worry about the operating system. But suppose you have a neat little node.js server, and you run a neat little JS powered web service with it, and you get a neat little denial of service attack from some jerk in Eastern Europe. Now all of a sudden those unimportant details like how the operating system handles sockets become vastly important.

Tools of the Trade

How do you cut through a huge codebase like butter? How can you home in on the insights that make you look like a genius? How can you do all this without breaking a sweat?

grep to grok

The first essential tool is grep. grep looks for a string in a set of files. There are many equivalents to grep (WinGrep, find-in-files in your favorite IDE, find on Windows/DOS), but the key is that you can give it a folder or set of files to search and a regular expression to look for, and get back a list of results and the files that contained them. I like to use find-in-files in Sublime Text for this because I can click search results and jump to them.

Grepping is super powerful. Suppose you get some weird error message – “Floorblop invalid.” – and nothing else from your app or service. You might be left with some questions. What’s a floorblop? Why is it invalid? Well, it might not matter. Just grep for the error string and look at the surrounding code to determine a cause. Sometimes errors are generated dynamically, so you might have to look for part of the string, like ' invalid."', if the “Floorblop” part was determined at runtime. With a little bit of cleverness, you can almost always reduce the number of potential sites for the error in the codebase to a manageable number – then just inspect the search results manually to find the place triggering the error.
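To see why the dynamic case matters, here is a hypothetical C++ sketch (invented for illustration, not from any real codebase) of how such a message might be produced – the full text “Floorblop invalid.” never appears in the source, but the trailing fragment does:

    #include <stdexcept>
    #include <string>

    // Hypothetical validation routine: the first word of the message is built
    // at runtime, so grepping for the full error text finds nothing.
    void validateThing(const std::string &name, bool ok)
    {
        if (!ok)
            throw std::runtime_error(name + " invalid."); // grep for ' invalid."' lands here
    }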

Now you don’t care about the text of the error, but instead can start looking for functions that might be failing and causing the error to be emitted. In effect you can think of the error string as a unique identifier for some situation rather than a human-relevant string. (And in many codebases, error messages are cryptic at best.)

In the course of investigating your error message, you might encounter some variables or functions that seem to be related. Grep comes to the rescue once again: grep for them! Generally symbols are unique in a project, so you can quickly find all the code relating to a variable or function. You can also take advantage of the language’s syntax to help narrow down your results – for instance, in C++ function implementations in a class or namespace often look like this:

void Foo::bar()

So if you want to find the definition of bar, you can grep for “::bar”, or if it’s a global function, include the return type, e.g. “void bar”, and go right to the definition. Think about fragments that would match the lines you want to see – you don’t have to get an exact match, just one close enough that you can find what you want quickly.

It should be obvious that these techniques can apply to any language, not just C/C++ – in Ruby you might use “def” in your search strings, in JS you might use “function”, in assembly, colons. The goal is to just filter down the number of results you have to consider, not take them to zero.

Binary Spelunking with strings

It’s easy to think of a binary as an opaque file. So it can be frustrating when you have a codebase with binaries in it. But fear not: DLLs, EXEs, .SOs, and a.out files have a ton of useful information in them. They have to – otherwise the operating system wouldn’t be able to run them!

One of the first tools to pull out is strings. strings scans any binary file for runs of printable ASCII characters (four or more in a row, by default) and prints each run it finds. You can store the results in a file, or run them through grep or another tool, to help narrow the results.
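If you want to demystify it, here is a rough C++ sketch of the core loop strings performs – not the real tool, just the idea:

    #include <cctype>
    #include <fstream>
    #include <iostream>
    #include <string>

    // Minimal strings-alike: print every run of printable characters that is
    // at least minLength bytes long.
    int main(int argc, char **argv)
    {
        if (argc < 2) { std::cerr << "usage: ministrings <file>\n"; return 1; }

        std::ifstream in(argv[1], std::ios::binary);
        const std::string::size_type minLength = 4; // same default as the real strings
        std::string run;
        char c;

        while (in.get(c))
        {
            if (std::isprint(static_cast<unsigned char>(c)))
            {
                run += c;
                continue;
            }
            if (run.size() >= minLength)
                std::cout << run << "\n";
            run.clear();
        }
        if (run.size() >= minLength)
            std::cout << run << "\n";
        return 0;
    }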

So suppose you have a function for which you cannot find the implementation – but you notice that the codebase includes a binary .so or DLL. You can check if the function is prepackaged in the binary by running strings and grepping for the name. If you get some hits there and nowhere else, you now have a good hint that you might not have all the source code.

Pulling Back the Hood: strace and Wireshark

Maybe you’re getting a weird failure when interacting with your app. The first thing to do is establish exactly what’s happening! strace is a super powerful tool on Linux that dumps every system call a program makes. With this, you can see when and what is being read from disk, what network activity is happening, whether it’s requesting memory from the system, and so on. All of a sudden a cryptic dynamic linker error becomes easy to diagnose – you can see which libraries the process is looking for, and whether they are actually found and loaded. Or you can see which OS call is hanging the process. Super useful, if a bit tedious.

On Windows, the Sysinternals suite – Process Monitor in particular – offers similar visibility into process activity.

Wireshark, a cross-platform packet sniffer, can really help you understand network activity. Suppose you are getting a cryptic error relating to some HTTP service. You might not be sure if the service is up, if you have a bad header, if the connection is being dropped… Wireshark will let you tell right away whether data is making it to the server and exactly what it is, as well as what the server is sending back to you. It will also catch things like corrupt TCP packets – rare failures, but ones that can totally break your app.

Once you’ve established the lowest levels are working properly, it’s easy to move up the stack to HTTP inspection tools like Charles or Telerik Fiddler. These can inspect HTTPS traffic and give you a higher level view of your app’s behavior. Chrome and Firefox also have built in tools that are similar.

Abusing Your Profiler and Debugger

You can also abuse profiling tools like Instruments to find out what an app is doing. On Windows, Windows Performance Analyzer and xperfview are the tools to try first. These tools allow you to attach to a running process on your system and see the callstacks and where time is spent in them.

In other words, they show you HOW the code is running in various situations and which are most important (due to being at top of call stack or being called often). It’s like a self-guided tour through an unknown codebase – how convenient!

You can also attach a debugger and use the time-honored technique of hitting pause a lot to see where time is spent during execution – or step through function calls to see what calls what. This is a bit more tedious, but also very useful for understanding a large codebase.

Study the Fundamentals

The best way to learn to understand big codebases is to… wait for it… study big codebases. While every project is unique, after a while you start to internalize common layouts and development philosophies. As you learn to recognize patterns and see how bigger systems fit together, you gain building blocks that let you map out unfamiliar systems and build your own systems to higher standards.

A few books I’ve found educational in this area:

  • Operating Systems: Design and Implementation – A walkthrough of Minix, a small self-contained POSIX environment. This goes through every detail of a “pocket” operating system and explains it in depth. A great introduction for the next two books…
  • Mac OS X Internals – A great guide to OS X’s structure with detailed examples and code walkthroughs. Understand the fundamentals of OS X while building valuable code analysis skills.
  • Windows Internals, Part 1 (6th Edition) – The same idea, but for Windows. A completely different take on OS design with a different set of constraints on development. In depth and fascinating.

Conclusions: Fighting Dirty & Getting to the Bottom

At the highest level of mastery, you should be prepared to reason about the behavior of any program from its highest levels of UI down to the behavior of the CPU. Every abstraction is leaky, and even though you want to preserve abstractions to keep development manageable, you shouldn’t let that limit your understanding. Get to the bottom of the stack first, then work your way back.

Don’t be afraid to fight dirty. Do you think something is important? Do you have an assumption about how some code works? Try breaking it to see how it fails. You’d be amazed how many problems can be solved by asking “Is it plugged in?”. Use version control or backups to make sure you can safely make dangerous changes and roll back to a safe state.

Finally – optimize for your sanity and development time. Don’t spend 12 hours when you can spend 1. Think about how you can narrow down your problem and check your assumptions quickly so you can focus on gaining knowledge and making the system do what you need. There is a universe of tools out there for wrangling complex systems. Use them!

Some Notes on Mac Configuration

I was recently advising a fellow TEC-er on setting up their Mac, and they suggested I write it up. These are the tools I use daily for development! Here we go…

  1. Alfred is an awesome tool. Like the Windows key on Win8, you can hit a keyboard shortcut (cmd-space by default) and type an app name or other command. Super handy.
  2. iTerm2 is a replacement for the terminal. The best thing about OS X is that it has a great UNIX command line alongside a nice GUI, and iTerm2 makes the terminal even better. Don’t forget to configure and use the global terminal hotkey. So handy!
  3. Spectacle lets you control your windows with the keyboard. So handy, especially on my Air with limited screen space.
  4. Sublime Text 3 is my go-to text editor on every platform, and especially Mac. Favorite features include cmd-d (to select multiple matching text ranges and edit them simultaneously) and the subl command line tool for opening folders/files from the terminal.
  5. Oh My ZSH! is a great upgrade for your terminal. Maintaining your own custom config is even better… but as a default setup it’s pretty good.
  6. RVM is a good way to get assorted Rubies installed on your system. Ruby is handy, but it suffers from versioning hell. RVM can help, sometimes.
  7. The GitHub app is also mega handy. No replacement for command line but it simplifies auth and basic commit/branch switching.

There are also some miscellaneous things you probably need to do: get the latest git, possibly install Homebrew and MacPorts (though I find more and more that I prefer to compile and install from source), get all the latest updates from the App Store, and install Xcode 5 (from the App Store these days). You might need CMake for Loom builds, along with the Android SDK and NDK. We like to use the HipChat app for communication amongst our team.

As you can see, I really like a keyboard oriented workflow. Perhaps it is the inevitable outcome of programming for so long. I’ve trended more and more towards it. I haven’t quite gotten to the point of using tmux and emacs in a fullscreen terminal, but who knows…

Loom at FITC Screens 2013 and Samsung Developer Conference

I will be speaking about Loom, Flash, and other interactive media topics at FITC SCREENS 2013! With glorious downtown Toronto as my backdrop, I’ll be discussing the Flash/AIR landscape, why we built Loom, and demoing why it’s a great open source platform on which to build apps AND games. If you need to build cross platform interactive apps with native features, Loom is worth your time.

More about Screens:

SCREENS is dedicated to covering development for mobile devices and operating systems. Through two days of presentations, demonstrations and panel discussions, as well as an optional day of workshops, SCREENS will give you the know-how to address client needs (and demands!) as we move forward into the mobile future. Visit fitc.ca/screens for more information.


I am also speaking at the Samsung Developer Conference Oct 27-29 in San Francisco. My session is a little bit up in the air but it will focus on mobile cross platform development, how to build neat tech stacks that leverage the native platform, and lessons learned in building mobile games on the platform.

More about Samsung Developer Conference:

Join us for the first annual Samsung Developers Conference to connect with industry visionaries, Samsung executives and technical leaders, and fellow developers. Get an exclusive first look at the latest tools, SDKs, and emerging platforms for Samsung devices to create what’s next.

That’s it for my October speaking plans. See you in Toronto and SF!

Deep Breath & Measure

“Your app is slow!”

What are you going to do about it? Is it that new code the intern landed? Is it the operating system’s fault? Misconfigured hardware? Or even, god forbid, a bad algorithm?

Your thoughts run to that bloated subsystem. You know the one. It’s complicated. You’ve had your eye on it for a while. It would be so sweet to refactor it. Take some of those algorithms down to O(log n) and add some caching. Get it running really sweet. All that code. It’s got to be it.

So you spend all day and a long night crunching on it. Getting it all working right. You spend some time tracking down bugs and adding a few previously unneeded features. You get it all working, check it in, and sleep till noon. You head back into the office to be greeted by… silence.

“Isn’t it faster now?” “Nope – still slow!”


What went wrong here?

One key step got skipped: measuring.

Your very first instinct whenever you see a performance issue should be to quantify it. The most powerful lines of code you can write are these:

var startTime = Platform.getTime();
// ... the code which is the problem area...
trace("Elapsed time: " + (Platform.getTime() - startTime));

I’ll often start at the root of my application and manually add blocks like these until I can identify the specific section which is problematic.

There are lots of fun variations on this idea – for instance, having a time threshold under which no output is printed to cut down on spam, or keeping track of averages/min/max, or even automatically profiling every function in your codebase (my game engine, Loom, does this – try profilerEnable and profilerDump in the Loom console!).
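As one illustration, here is a rough sketch of the threshold variation written as a scope-based timer – shown in C++ for concreteness, with hypothetical names; the same idea ports to any language:

    #include <chrono>
    #include <cstdio>

    // Prints a message only when the timed scope exceeds a threshold, so the
    // log stays quiet until something is actually slow.
    struct ScopedTimer
    {
        const char *label;
        double thresholdMs;
        std::chrono::steady_clock::time_point start;

        ScopedTimer(const char *label_, double thresholdMs_)
            : label(label_), thresholdMs(thresholdMs_),
              start(std::chrono::steady_clock::now()) {}

        ~ScopedTimer()
        {
            const double elapsedMs = std::chrono::duration<double, std::milli>(
                std::chrono::steady_clock::now() - start).count();
            if (elapsedMs >= thresholdMs)
                std::printf("%s took %.2f ms\n", label, elapsedMs);
        }
    };

    void suspectFunction()
    {
        ScopedTimer timer("suspectFunction", 5.0); // only log if we blow past 5ms
        // ... the code which is the problem area ...
    }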

If you can’t nail the problem down to the point where you can measure it, it means you don’t understand the problem, and you’re doomed to thrash around with random changes until you fix it… or until you convince yourself it’s better without having changed anything. (You did all that work, didn’t you?)

The Psychological Barrier

Humans are actually VERY perceptive about fast changes in their environments. VR helmets induce motion sickness if they lag more than a dozen milliseconds behind the head’s actual motion. Hearing depends on detecting pressure changes that last far less than a millisecond.

So when you’re working on performance, eyeballing it should be plenty good, right?

Not true! People aren’t very good at remembering a specific arbitrary interval (say – 150ms vs. 100ms) and recalling it later. So when you’re optimizing load time and you shave 10% off of your 3 second load, you might not even notice that it’s faster. If you’re tired and grumpy, or just distracted, you might even think it’s slower!

It’s much easier and more reliable to just time it and keep some simple notes on your changes and what they did to your metric.

Deep Breath

The first thing to do when you hit a performance bump is to identify and measure it… then take a deep breath and think about what might be causing it. If you can narrow the problem down to a specific hot spot, you’re golden. (Assuming you can speed that part up – but that’s why we have fancy CS degrees, right? ;))

Now all you have to do is iterate. Try something. Try ten things. Measure after each one. Make hypotheses about what might be slow and try to remove it from the equation to see if it really IS your bottleneck. There’s a whole science here – check out my book on video game optimization for a full discussion – but once you are able to measure progress you will be able to move forward.

This is an area where I often see haste lead developers into wasted hours or days or increased technical debt – when a little care and patience would crack the problem right away! So remember to take a deep breath and measure BEFORE you code. :)


The Draw-Render-Update Conspiracy

I love clear terminology. It’s the engineer in me. This post is about some self-perpetuating terminology that confused me as a new developer.

The confusion centers around functions named draw() or renderFoo(). As a new developer, you get excited – “Ah hah!” you think to yourself, “I am about to see something cool go down here.” So you pop open the function, and it’s setting the x coordinate or changing a color on some other data structure. “WTF! There’s no drawing or rendering happening here!” you say, disgusted and confused. Or worse, you assume that changing those values has any immediate effect beyond altering a few bytes in memory – and ascribe magical behavior to that section of the code, corrupting your ability to understand and debug.

The reality, of course, is that the actual drawing – in terms of commands to change pixels on the screen – is happening elsewhere, deep in the bowels of some fanatically optimized inner loop (you hope). The draw() function is just updating some data that is used elsewhere. It’s not drawing anything, any more than telling your painter what color you want your bathroom is painting. That’s why I try to name them things like update().
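Here is a hypothetical sketch of the pattern (all names invented for illustration): the “draw” call only records intent, and the pixels actually change somewhere else entirely.

    #include <cstdio>
    #include <vector>

    struct Sprite
    {
        float x = 0, y = 0;
        int frame = 0;

        // Despite the name, nothing is rendered here - we just update the data
        // the real renderer will consume later. update() would be more honest.
        void draw(float newX, float newY, int newFrame)
        {
            x = newX;
            y = newY;
            frame = newFrame;
        }
    };

    // The actual drawing lives in a separate loop, far from draw() above. A
    // printf stands in for the fanatically optimized inner loop you hope exists.
    void renderAll(const std::vector<Sprite> &sprites)
    {
        for (const Sprite &s : sprites)
            std::printf("blit frame %d at (%.1f, %.1f)\n", s.frame, s.x, s.y);
    }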

But naming functions update() is where the conspiracy gets ahold of me. I’ve already been disappointed by functions named draw and render that don’t do either, for years. I’ve become cynical. “Well, everyone knows you don’t ACTUALLY render anything in your render function. It’s just a state update. It’s this code’s perspective on what rendering means.” And due to this etymological relativism, I write functions called renderToast that don’t really have anything to do with displaying charred bread products.

What’s the moral of this story? The trivial one is “name stuff accurately.” But the rot has already set in, and language is situational and metaphoric. So the real takeaway, for those who aren’t on the “inside” of the conspiracy, is a warning – things aren’t always what they seem. Don’t trust that draw function until you really know what it is doing. Always read the whole codebase!

(And – OK, maybe it’s more of a stand alone complex than a conspiracy, if we want to be really specific.)

Flatten Your Conditionals!

Deep nesting is a pet peeve of mine. I’m going to show you what deeply nested code is and discuss some strategies for keeping things tidy. It’s my opinion that deep nesting is a sign of sloppy code.

You know, code like this (with my condolences to the author):

    if (productId != nil) {

        NSLog(@"EBPurchase requestProduct: %@", productId);

        if ([SKPaymentQueue canMakePayments]) {
            // Yes, In-App Purchase is enabled on this device.
            // Proceed to fetch available In-App Purchase items.

            // Initiate a product request of the Product ID.
            SKProductsRequest *prodRequest = [[SKProductsRequest alloc] initWithProductIdentifiers:[NSSet setWithObject:productId]];
            prodRequest.delegate = self;
            [prodRequest start];
            [prodRequest release];

            return YES;

        } else {
            // Notify user that In-App Purchase is Disabled.

            NSLog(@"EBPurchase requestProduct: IAP Disabled");

            return NO;
        }

    } else {

        NSLog(@"EBPurchase requestProduct: productId = NIL");

        return NO;
    }
This code is hard to understand. It’s hard to understand because error handling is distant from the error checks (for instance, the check for nil is at the beginning but the error and return are at the end!). It’s hard to understand because the important parts are deeply indented, giving you less headroom. If you want to add additional checks, it’s hard to know where to add them – and you have to touch lots of unrelated lines to change indent level. And there are many exit points scattered throughout. GROSS.

Whenever I see code like this I cringe. When I get the chance, I like to untangle it (or even catch it in code review). It’s soothing, simple work. To be sure, the functionality of the code is fine – it’s purely how it is written that annoys me.

There’s a key thing to be aware of in the structure of this code – it has a bunch of early outs related to error handling. This is a common pattern so it’s worth walking through the cleanup process. Let’s pull the first block out:

    if(productId == nil) {
        NSLog(@"EBPurchase requestProduct: productId = NIL");
        return NO;
    }

    NSLog(@"EBPurchase requestProduct: %@", productId);

    if ([SKPaymentQueue canMakePayments] == YES) {
        // Initiate a product request of the Product ID.
        SKProductsRequest *prodRequest = [[SKProductsRequest alloc] initWithProductIdentifiers:[NSSet setWithObject:productId]];
        prodRequest.delegate = self;
        [prodRequest start];
        [prodRequest release];

        return YES;
    } else {
        // Notify user that In-App Purchase is Disabled.
        NSLog(@"EBPurchase requestProduct: IAP Disabled");
        return NO;
    }

    // Never get here.
    return NO;

It’s a LOT better, but now we have a return that can never be run. Some error handling code is still far from the error detecting code. So still a little messy. Let’s do the same cleanup again on the second block:

    if(productId == nil) {
        NSLog(@"EBPurchase requestProduct: productId = NIL");
        return NO;
    }

    NSLog(@"EBPurchase requestProduct: %@", productId);

    if ([SKPaymentQueue canMakePayments] == NO) {
        // Notify user that In-App Purchase is Disabled.
        NSLog(@"EBPurchase requestProduct: IAP Disabled");
        return NO;
    }

    // Initiate a product request of the Product ID.
    SKProductsRequest *prodRequest = [[SKProductsRequest alloc] initWithProductIdentifiers:[NSSet setWithObject:productId]];
    prodRequest.delegate = self;
    [prodRequest start];
    [prodRequest release];

    return YES;

See how much cleaner that is? Beyond saving indents, it also exposes the structure of the algorithm a great deal more clearly – check it out:

  1. Check for nil productId; bail if absent.
  2. Log productId if it is present.
  3. Check if we can make payments/IAP is active; bail if not.
  4. Submit the product info request.
  5. Return success!

The code and its “flowchart” now match up nicely, and if you modify one, it’s easy to identify the change in the other. This might seem like a little thing, but I find it shows that the purpose + structure of the function is well set up. And if you can’t write the function without violating this rule, it’s often a very solid clue you need to introduce some more abstraction – tactics such as breaking stuff up into helper methods, reorganizing your data structures a little bit, centralizing lookups/checks, and so on.

Something to keep in mind next time you find yourself hitting tab more than a couple times – flatten your conditionals!

Ludum Dare 26 & Loom

Are you a fan of Ludum Dare? I’ve loved watching it for a long time. The huge community of excited developers is fantastic to watch, and some great games come out every time. More than that, LD is a great opportunity. In fact, such a good opportunity that we’re giving LD participants a huge deal on Loom (but more on that later).

The incredible opportunity in an event like LD is that it gets you to finish something. It’s so common for projects to run on and on and on and on… Professionally, you could work in AAA games for a decade and only ship a few games. Imagine being a professional painter and only making 10 paintings in your whole career.

There are big lessons you only learn when you finish. Like – was the feature you spent 80% of your time working on what made the game fun, or was it the feature you added at the last minute on a lark that made the whole game work? Is your gameplay immediately understandable? How much is your fun driven by content vs. gameplay? What dumb things kept people from enjoying your game (like missing DLLs, unclear instructions, installer issues, and so on)? What REALLY goes into the last 20% it takes to ship?

You also get the big endorphin rush of release. It feels GOOD to ship. Even if you decide the project was a failure, completing it is good. You can put it on the shelf and refer to it later. And it’s motivating to know you’ve gotten something DONE and don’t have to think about it any longer.

It’s easy to get stuck in the doldrums of project creation. You end up going around and around creating new things on new tech. It’s shiny and in some ways fun, but you never experience the growth and maturation that comes from shipping and sharing your creation with the world. Shipping – even something small – gets you out of that rut.

Take some time and participate in Ludum Dare 26. Creating and finishing a small game project is one of the best investments you can make in yourself – not just as a game developer but as a professional. It’s easy to overlook how valuable this can be.

And of course – Loom is a great fit for making small games fast. Through LD26, use the code GO_LD26 to get 50% off all Loom subscriptions. Get Loom and go make something cool!

Loom is Launched!



You may have wondered what I’ve been up to since PushButton Labs and PushButton Engine. Nate Beck, Josh Engebretson, and I are proud to share our latest creation, the Loom Game Engine, with the world. It’s a native mobile game engine with live reloading of code and assets, a great command line workflow, and a powerful AS3-like scripting language.

Check out this sweet video demoing Loom:

We’re giving away Loom Indie Licenses (normally $500) for FREE until Mar 29, the last day of GDC. We’ve already given away almost $2,000,000 in licenses. Get yours now!

TCP is the worst abstraction.

TCP is the worst abstraction.

You are hopefully familiar with Leaky Abstractions as described by Joel Spolsky. The idea is that when you add layers to hide messy details, you can mostly avoid having to know what exactly is going on – until something breaks. Think of it as putting a smooth plastic coating on your car. Everything is really simple and zero-maintenance until your engine breaks and now you’re peeling plastic back trying to figure out which part is on fire…

TCP makes some big promises. “Your data will magically arrive in order and on time!” “Don’t worry about it, I’ll retry for you.” “Sure – I can send any amount of data!” “Hahah, packet sizes? I’m sure we don’t have to worry about those.”

Let’s talk about springing leaks. Just like when your upstairs neighbor’s toilet springs a leak and you have to deal with the concrete realities that a high flow water source above your bedroom ceiling introduces, springing leaks means you can’t use your abstraction anymore – you now have to work with the underlying system, often at one remove (or more!) because you’re working through the abstraction you chose to shield you from this in the first place!

TCP is leaky as a sieve. TCP says “I’ll just act like a stream and send bytes to someone on the internet!” But here are just a few areas where TCP breaks (a concrete sketch of the read-side leaks follows the list):

  • If you send too much data at once (the OS buffer fills and the write fails; you then have to resend).
  • If you send too little data at a time (the OS will sometimes fix this for you, see Nagle’s Algorithm, which can be good or bad depending on when that data needs to go over the wire).
  • If you try to read too much data at once (again, the OS receive buffer has limited size – so you have to be able to read your data in chunks that fit inside that limit).
  • If you transfer data at the wrong rate (the TCP flow control rules can be a problem).
  • If you try to read too little data at a time (then OS call overhead dominates your transfer speeds).
  • If you want to assume data has arrived (it may not have; you have to peek and see how much data there is and only read if there is enough, which necessitates careful design of your protocol to accommodate this).
  • If you want to initialize a connection in a deterministic fashion. (You have to do a bunch of careful checks of domain/IP/etc. to make sure it will even go through and once the connection is initialized you have to figure out if it’s alive or not. It can also take quite a while to establish a connection and get data flowing, see efforts like SPDY)
  • If you are on a lossy network (it will incur arbitrary overhead resending lost data).
  • If you want to manage latency (you have to take care to send data in correct packet boundaries).
  • If you want to connect through a firewall (good luck with that one).
  • If you want to use nonblocking IO. (You have to do a bunch of platform specific crud and even then not all actions are nonblocking; you have to live in a separate thread and block there.)
  • If you want to run a popular service. (There are a lot of ways the OS can be tricked by outside parties into mismanaging its resources leading to starvation/denial of service attacks.)
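To make a couple of those leaks concrete (the “read too much,” “read too little,” and “assume data has arrived” items), here is a minimal C++ sketch of the loop every TCP protocol ends up needing, assuming a connected POSIX socket fd and a message-framing scheme of your own design:

    #include <sys/socket.h>
    #include <cstddef>
    #include <cstdint>

    // recv() may return fewer bytes than requested, so "read a message" really
    // means "loop until your framing says the message is complete."
    bool readExactly(int fd, uint8_t *buffer, size_t length)
    {
        size_t total = 0;
        while (total < length)
        {
            const ssize_t got = recv(fd, buffer + total, length - total, 0);
            if (got <= 0) // 0 = peer closed; <0 = error (check errno for EINTR/EAGAIN)
                return false;
            total += static_cast<size_t>(got);
        }
        return true;
    }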

IMHO, TCP is an abstraction in name only. If you want to get any kind of decent results from it, you have to fully understand the entire stack. So now not only do you have to know everything about TCP, you have to know everything (or at least most of it) about IP, about how the OS runs its networking stack, about what tricks routers and the internet will play on you, about how your protocol’s data is set up, and so on.

I came to networking in a roundabout way. I did a couple of small TCP projects in my teens, but I spent most of my formative programming years (18-23 or so) working with Torque, which uses the User Datagram Protocol (UDP). Here’s what UDP code looks like:

// Send a packet.
sendto(mysocket, data, dataLen, 0, (struct sockaddr *)&destAddress, sizeof(destAddress));

// Receive a packet. Note the address length is passed by pointer so the OS
// can tell you how big the sender's address actually was.
socklen_t fromAddressLen = sizeof(fromAddress);
recvfrom(mysocket, data, dataLen, 0, (struct sockaddr *)&fromAddress, &fromAddressLen);

It’s very very simple and it maps almost directly to what the Internet actually gives you, which is the ability to send and receive routed packets from peers. These packets aren’t guaranteed to arrive in order nor are they guaranteed to arrive at all. In general they won’t be corrupted but it would behoove you to check that, too.

This is primitive, like banging two rocks together! Why do this to yourself? Well – it depends. If you just need to create some basic networking behavior and don’t care if it’s subpar, TCP works well enough, and if you have to, you can get it to sing for certain situations. And sometimes TCP is required because of firewalls or other technical issues. But if you want to build something that is native for the network, and really works well, go with UDP. UDP is a flat abstraction. You have to take responsibility for the network’s behavior and handle packet loss and misdelivery. But by doing so you can skip leaky abstractions and take full advantage of what the network can do for you.
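Taking on that responsibility usually means putting a tiny header of your own in front of every payload. Here is a hypothetical sketch (not Torque’s actual wire format) of the kind of thing you end up with:

    #include <cstddef>
    #include <cstdint>

    // Prepended to each UDP payload: the sequence number lets the receiver
    // detect loss and reordering, and the checksum catches corruption.
    struct PacketHeader
    {
        uint32_t sequence;
        uint32_t checksum;
    };

    // Toy checksum for illustration only; a real protocol would use a CRC.
    uint32_t computeChecksum(const uint8_t *data, size_t len)
    {
        uint32_t sum = 0;
        for (size_t i = 0; i < len; i++)
            sum = sum * 31 + data[i];
        return sum;
    }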

Sometimes it’s better to solve hard problems up front, rather than ignoring them and hoping they go away.

