Tips for Flash Developers Looking at Hardware 3D

Adobe has announced 3D support in Flash. A huge step forward for the Flash development community! Tremendous things will be possible. If you haven’t seen it already, here’s a great video of what’s possible:

The demo at Adobe MAX 2010 was impressive, and it was a huge treat to see my friends at Mythos Lab and Alternativa get featured on the big screen during the keynote. 🙂

But let’s get back to reality. This is nothing that hasn’t been out on other platforms for years. Why is it worth talking about? Why does it matter when your PC can do way more?

It’s the audience. When Flash 3D drops, you will be able to create 3d content and bring it to all those billion people who have Flash Player installed. And I am sure that with Adobe’s focus on mobile, they will look towards mobile support, too.

If the model for AAA games like God Of War and Mass Effect is big budget Hollywood movies, then the Flash space is television. There are lots of short-form products (5-10 minutes avg playtime) produced by tiny and often independent teams (like much of what’s on Kongregate or Newgrounds). But there are also longer form products, updated weekly, that have huge distribution. In the TV space they call them series; online they’re called social games. The numbers are similar, too. Farmville gets 56M monthly players currently. How many people watch NCIS or another hit TV show in a month?

TV is probably still bigger for now, but social games monetize better per user. 🙂

So – 3d in Flash will have a big impact. And a lot of developers will be dragged kicking and screaming into the brave new world. I wanted to share a few lessons that I learned in my previous life as a 3d middleware developer at GarageGames. Some are big, some are small, but hopefully all are things that will save you, a Flash developer looking at this new hardware 3D, time and pain. They took me a while to learn.

Framerate and Performance

Don’t use frame rate as a measure of performance. It’s useless. All you really want is to target a fixed framerate of 30Hz, 60Hz, or 120Hz. Here’s how that goes down, by the way: your project will run at 120Hz until the artists get their hands on it. Then you will decide that 60Hz is a great target – looks good, feels good. But when it comes down to the deadline, you’ll hit some bumps and settle for 30Hz. Lucky is the project that can ship with 60Hz, and the only guys I’ve ever seen go out with 120Hz are doing VR, where you have to have it to avoid nausea.

What about variable framerate? It’s bad. Don’t go there. Humans are very good at compensating for fixed delays, and very good at detecting variable delays. It’s much better to cap to a fixed low framerate, because users will automatically compensate and cease to notice it. (This is how our nervous system works; your arms have a fixed delay for nerve signals to propagate, muscles to react, etc. So the capability is built into the brain at a very low level.) However, if it varies, the user will notice every little change in performance and be frustrated.

Back to the performance measurement issue, frame rates are not linearly comparable. What you want is milliseconds per frame (mspf). How come? Because Hz is not linear. It is harder to get from 90Hz from 60Hz than it is to get from 30Hz to 60Hz. How come? Looks like the same diff, right? Let’s put the same problem in mspf. Going from 30Hz to 60Hz means going from 32mspf to 16mspf. But going from 60Hz to 90Hz means going from 16mspf to 11mspf. In the first case I am cutting 16ms off my frame time, which is a big chunk, but I still end up with a relatively generous 16mspf. In the second case, I have to squeeze 5 mspf from an already meager frame budget.

Using mspf also makes it easy to discuss performance gains. If I have a task that takes 4mspf (say applying a full screen blur), and I can optimize it down to 3mspf, I know exactly how much of a win I’ll get no matter how the rest of the application is performing. Compare that to saying that the blur runs at 250Hz then I have optimized it to 333Hz. And when I am budgeting performance I can easily divvy up my available frametime to specific tasks. We might have 8mspf for physics, 4mspf for render calls, and 4mspf for the GPU to finish.

Bottom line: use milliseconds per frame not hertz for performance measurement and budgeting.

GPU Readbacks

(This advice and the advice in the following sections is general; it applies to OGL/D3D and console HW APIs, and I doubt Molehill can deviate from the fundamental hardware reality. Of course, Molehill’s software renderer probably has different performance characteristics – slower, but more permissive.)

Never ever read back from a GPU resource. Ever. Don’t. Stop it. Ok – you can maybe do this ONCE per frame, if you are careful and build your renderer around it.

Why is readback so incredibly bad? Because it forces synchronization (as do a few other things in the graphics API). Then, while the GPU is stopped, you do a slow memory read back to the CPU, which is also stopped and waiting for the data.

Normally, the GPU will run ahead of the CPU – you fire off some draw commands and it goes and does its magic. But imagine you issue these commands:

Clear framebuffer.
Draw triangles.
Read back from framebuffer.

Normally, you’d never see the cost of #2 on the CPU – it would immediately return and you’d go on your way. Say it takes 1ms to do that draw. The call returns in a few microseconds and you keep doing. But when you issue command 3, the CPU has to wait for the GPU to finish command 2. Then, because framebuffers are often stored in proprietary formats either for performance or quality reasons, the GPU has to prepare it for readback. Then it can finally stream the requested data back to the CPU, and subsequently continue on with its rendering. All this cost shows up as you waiting a long time on #3.

Readbacks are tempting to the Flash developer because copying stuff out of the Flash vector renderer is basically free. But on hardware, it gets you a triple whammy – you stop the world, you make the GPU do a bunch of extra work to prepare data for readback, and you have a data copy operation. Don’t do it.

Let the GPU Run Free

Ideally, you want to issue the minimum number of calls to the 3d API that will let the GPU do maximum useful work without interruption. GPUs are like bullet trains. They go really, really fast as long as they don’t have to stop or make right turns. Sometimes it is even better to do somewhat wasteful things because the gain from continuous operation is so big. GPUs are on track to have thousands of parallel processing cores. Just let ’em run and you’ll be amazed at what they can do!

Compare this to the Flash vector renderer which is best when it is given minimal work to do. It’s a great renderer but it doesn’t have the benefit of powerful dedicated hardware that has been tuned for decades to run it fast.

If you have a big batch of geometry, it’s often best to simply draw it. You will want to figure out what the threshold is at which culling is beneficial, but it can often be on the order of tens of thousands of triangles.

Vertex shading is practically free when you’re looking at rendering 1080p at 60Hz. You’re going to be hitting 124M pixels every frame at MINIMUM – if you have any overdraw it can easily hit 300M or higher. If you have a million triangles in your scene, you’re only going to be doing 60M triangles in that same time frame. So take advantage of it when you can.

Don’t let the GPU run too far ahead, though – it introduces control latency. GPU drivers can often buffer several frames ahead, and if you aren’t careful you can add 50-100ms of latency between a user hitting a button and something happening. Introducing a small readback at end of frame can help force synchronization. (Molehill might deal with this for you, too.)

Performance Trade-offs & Maintaining Your Image

On the GPU, you have huge opportunities to trade space for time for quality. Lookup textures can replace costly math functions (or make them tweakable by artists). You can sacrifice precision (either using half precision or just being a little bit wrong) for performance, too, to get big wins. The great thing about graphics programming is that it only needs to LOOK right, not BE right. So pile on the hacks – if it fits your game’s look, it’s a good solution.

That’s enough on performance. I could write a whole book on optimizing 3d game performance. In fact, I have; it’s called Video Game Optimization. It covers the whole problem from scheduling for performance to measuring to identifying bottlenecks to fixing them. It even has a chapter on Flash optimization!

Stand on the Shoulders of Giants

Graphics are sexy and well studied. The fount of graphics research is SIGGRAPH (any further back than that and you’re tracking individual researchers). Literally every rendering problem you will encounter when working with Molehill has a solution that was published at Siggraph thirty years ago, played with at SGI twenty years ago, considered at 3DFX 15 years ago, and brought into the mainstream and shipped on a console or PC game 5-10 years ago. Around then it was also republished as a practical implementation in a Gems book, and also shipped as a card demo by ATI, nVidia, or Intel. If it’s really good, Carmack or Abrash prototyped it when they were working on Quake 1, Epic has it in Unreal, Crytek has some great demo videos of it, or Valve has published slides on it.

Am I claiming that there’s no fundamental research left in 3d graphics, or no expressivity? Of course not. But you owe it to yourself to be familiar with existing work in the field before you go reinventing the wheel, and the field is aggressively researched. So I would not expect to stumble on any huge break through right away, just like if you pick up oil painting you’re probably not going to reinvent the field…

I think the interesting part of graphics is finding the best fit for your specific problem and enabling artists. It’s a craft. Look at what Pixar does. Their research has always been in support of their story and their specific technology needs. Out of that they have ended up contributing a lot of great stuff, and innovating in a lot of ways, but they did it with a strong understanding of existing technology.

I encourage you to do the same. It is what I have done in my own projects and it has worked out well. When I have pursued tech for tech’s sake the results have always been less effective – good for a demo but ultimately ineffective in enabling my team to succeed.

It’s the Assets, Dummy

How do you make a good-looking game? Really, it’s not the technology. Shaders, complex rendering techniques, and so forth are all fine and they can enable good results. But great art is what carries a project.

Look at that video at the top of the article (it’s OK; I’ll wait). There are very few complex rendering techniques at work. Mostly, it is great art (and a great lightmapper) with just a few simple shaders and effects.

Look at Mass Effect. They use shaders, sure, but the real source of the game’s look is the tireless work of many talented artists.

Or look at one of our older projects, Grunts: Skirmish:

All the runtime is doing is compositing unrotated, unscaled bitmaps. But the great work of our artist, Tim Aste, makes it a memorable and interesting visual.

To that end it’s vital that you make it easy for artists to work with your technology. There are two lessons I have learned in this are:

Make it easy to see. Have an automatically updated live build somewhere and let the artists have the URL. That way they can check art in and see how it looks in the game, or tweak things as needed, without involving a programmer.
Make it easy to process art. Make sure that the inputs to your art pipeline are high-quality uncompressed files. Then process them into whatever format and quality level you require. Optimize the final format for very fast download and unpacking.
Your art needs and formats will change. Be able, at any time, to reprocess all of your source assets into their final form. Get everyone used to the system early on and you will have the agility you need late in the project to succeed. There’s nothing worse than asking your artists to hand-process dozens or hundreds of assets due to some technical change! (It’s not just unpleasant, it also introduces mistakes, since they are people, not machines.)

Art pipelines and processing is a subject worth of a whole bookshelf. Go out and learn as you need it. Not every project needs an all-encompassing solution, but… well, read the next section. 🙂

Trends & Closing Thoughts

3D changes the landscape of application development. Flash has tracked the console space’s evolution, on a much shorter timeframe. Looking at how consoles were affected by the introduction of 3D is very instructive:

Art becomes more complex. Creating an animated character in the SNES days was the work of a single person for a few days. Creating a character for Mass Effect is months of work for a team.
Asset size increases. A full SNES game is 16MB or so. 40 hours of gameplay, tons of content, music, visuals, cutscenes, etc. A typical XBox 360 title is 6gb, often compressed, and now 2 disc games are coming out. An asset footprint of 15gb is not unrealistic. It will be interesting to see how this plays out in the online space.
Budgets and complexity swell. I bet you could make a complete, high quality clone of A Link to the Past for two month’s operating budget of Mass Effect or Halo. Teams will get bigger, projects will become more complex, and technical requirements will increase.
Gameplay remains relevant. The hit titles in the console space are those with high technical quality – but they are also fun. Mass Effect is highly immersive. Halo has highly polished gameplay. God of War is a remarkable experience in an industry full of remarkable experiences. And indie titles that are genuinely fun and engaging – like Minecraft – are able to succeed without large marketing or art budgets.

Right now, Flash is at the same place the PC space was in 1998-1999. We can do compelling 2D or limited 3D content and teams are getting larger but not yet huge. Molehill will bump us forward to around 2001-2002, perhaps a bit further. It’s an exciting time to be a Flash developer.

The kinds of experiences we will be able to build are going to undergo a quantum leap, as will the demands on us as developers. We are in a unique position to leverage the lessons of the past to build the games of the future. What do you think the major issues will be? How can we build great content on this new feature set?

12 responses to “Tips for Flash Developers Looking at Hardware 3D”

orion elenzil

November 13, 2010 at 3:46 pm

great article, Ben.

peeps new to 3D – take heed! Ben knows.

i’ve been fooling around with framebuffer reads on the iPad
and was a bit surprised to find that the performance hit
seems to be proportional to the number of pixels being read.
eg, reading the whole screen takes maybe 300ms or a bit more (untenable),
while reading a 16×16 rectangle is negligable.
sascha/hdrs

November 3, 2010 at 6:03 am

Hey Ben, thanks for the technical explanation!

I found ME’s story extremely shallow but that might be just me. I like it when stories have plot twists that make you go WTH or AHAAA but ME was so predictable. 😉

Either way … really looking forward to Alternativa3D v8! Interesting times ahead for us .. uhmm .. Flash-using Game devs!
sascha/hdrs

November 2, 2010 at 10:11 pm

Very interesting read about the intricacies of the GPU rendering process! Maybe it would be useful to explain what GPU Readbacks actually are, for the devs who have never touched GPU programming yet.

In the end what I’d add is that if the game relies on story, don’t forget to care about creating an interesting story. Mass Effect is indeed comparable to a Hollywood Blockbuster, both in shininess and in flatness of story! 😉
1. Ben Garney
  
  November 2, 2010 at 11:23 pm
  
  Thanks!
  
  A readback is when you read data that is stored on the GPU. For instance, copying data from the backbuffer, a vertex buffer, or an index buffer. Try to be write only.
  
  I dunno – I really like Mass Effect 2’s story. 🙂 But of course there are lots of ways to tell great stories in games, not all of them involving a team of writers…
Matthew Fabb

November 2, 2010 at 7:49 am

From what I’ve read the interesting thing that Adobe is doing is creating a very low-level API and getting involved with the various groups to build 3D engines on top of that. So it seems that it’s groups like Away3D and Alternativa that will optimize how to use the GPU and then developers will build on top of that with a high-level API. If the CPU based 3D Flash engines of the past are any indication, each engine will have different strengths and weaknesses.
Dave Young

November 2, 2010 at 6:24 am

An interesting read Ben.. I appreciate the level of detail in this post, it’s a mini-gem. I got a couple of major take-aways from this about framerate and CPU/GPU usage that might end up affecting the way I am designing a game.
Phil Peron

November 1, 2010 at 5:53 pm

“When I have pursued tech for tech’s sake the results have always been less effective – good for a demo but ultimately ineffective in enabling my team to succeed.”

So true! And well stated. Thanks so much for the writeup, Ben. Great stuff.
1. Ben Garney
  
  November 1, 2010 at 10:34 pm
  
  Thanks, Phil! 🙂
swingpants

November 1, 2010 at 4:30 pm

Great read, thanks for the post Ben. Just about to buy your book =-)
1. Ben Garney
  
  November 1, 2010 at 10:34 pm
  
  Awesome! I hope you like it! 🙂
reelfernandes

November 1, 2010 at 3:50 pm

Interesting read. I’m looking forward to seeing what the Flash 3D workflow is like, using a modeling app like Modo.
1. Ben Garney
  
  November 1, 2010 at 10:33 pm
  
  The art pipeline is a big piece that Adobe hasn’t said much about yet.