3D in Flash 10 & Git

I spent a little time with Flash 10’s 3d features recently. Since Flash 10.1 is imminent and FP10 has been at 90%+ penetration for a while now, it’s probably safe to start looking at using FP10 stuff in my projects. 🙂

I also used this as an opportunity to try out git. It was easy to get git installed on OSX (I used the command line version, installed from git-osx-installer) and put my code up on Github. You can browse my test code at http://github.com/bengarney/garney-experiments/tree/master/exploringFlash3D/.

My main concern was the transformation pipeline – I think there might be some benefits to using 3d positions for the rendering pipeline in PBE. So I wanted to do a brief survey of the built-in 3d capabilities, then look more closely at the transformation routines.

My first test was making a DisplayObject rotate in 3d (minimal source code). It runs well, and if you turn on redraw region display, you can see that it’s properly calculating the parts of the screen that need to be modified.
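For readers who do not want to open the repository, a test like this can be sketched in a few lines. This is not the linked source; the class name RotateTest and the drawn square are made up for illustration, and it assumes the class is the document class of the main SWF so that stage is available in the constructor.

```actionscript
package
{
    import flash.display.Sprite;
    import flash.events.Event;

    // Hypothetical document class; not the code from the linked repository.
    public class RotateTest extends Sprite
    {
        private var shape:Sprite = new Sprite();

        public function RotateTest()
        {
            // Draw a simple square centered on the sprite's origin.
            shape.graphics.beginFill(0x3366CC);
            shape.graphics.drawRect(-50, -50, 100, 100);
            shape.graphics.endFill();

            // Assumes stage is non-null (document class of the main SWF).
            shape.x = stage.stageWidth / 2;
            shape.y = stage.stageHeight / 2;
            addChild(shape);

            addEventListener(Event.ENTER_FRAME, onFrame);
        }

        private function onFrame(e:Event):void
        {
            // Changing a 3D property such as rotationY engages the
            // built-in 3D rendering path for this DisplayObject.
            shape.rotationY += 2;
        }
    }
}
```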

This was easy to write, but it revealed the primary flaws with the built-in Flash 3d capabilities. First, look closely: the edges of the shape are sharp, but the interior detail is aliased. This is because whenever the 3D rendering path is engaged, it turns on cacheAsBitmap. This is fine for simple scenarios (say, taking a single element in a UI and giving it a nice transition) but not for more complex situations.

Which brings us to the second and bigger flaw. I added a thousand simple particle sprites at different 3d positions (source code). This runs extremely slowly because of an issue described by Keith Peters involving nested 3d transforms. Nested 3d objects cause excessive bitmap caching, dramatically reducing performance. You might end up with bitmap caching in action on every 3d object and every DisplayObject containing 3d objects.

In addition, because the object is cached as a bitmap, moving it closer to or farther from the camera results in an upsampled or downsampled image. So you tend to get mediocre visual results if your objects move much.

My next step was to stop using the 3d capabilities of DisplayObjects, and just position them in x/y based on their 3D position (source code, notice it is two files now). This gave a massive performance gain. At low quality, 1000 particles run at 1440×700 at an acceptable FPS. Most of the overhead is in the Flash renderer, not the code that updates DisplayObject positions, but the transformation still takes a while, and it puts a lot of pressure on the garbage collector because 1000 temporary Vector3D instances are created every frame. (600kb/second or so – not insignificant.)
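To make the allocation problem concrete, here is a rough sketch of what such an update loop might look like. The class and parameter names are hypothetical and this is not the code from the repository; it only shows where the temporaries come from, and it leaves out perspective entirely.

```actionscript
package
{
    import flash.display.DisplayObject;
    import flash.geom.Matrix3D;
    import flash.geom.Vector3D;

    // Hypothetical helper; names are illustrative, not from the linked repository.
    public class ParticlePlacer
    {
        // Positions each sprite in x/y from its transformed 3D point.
        // Matrix3D.transformVector returns a new Vector3D on every call,
        // so this loop creates one temporary object per particle per frame.
        public static function update(sprites:Vector.<DisplayObject>,
                                      positions:Vector.<Vector3D>,
                                      worldMatrix:Matrix3D):void
        {
            var count:int = sprites.length;
            for (var k:int = 0; k < count; k++)
            {
                var p:Vector3D = worldMatrix.transformVector(positions[k]);
                sprites[k].x = p.x;
                sprites[k].y = p.y;
            }
        }
    }
}
```

With 1000 particles, that is 1000 short-lived Vector3D objects per frame, which is the garbage collector pressure described above.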

Next I figured it would be helpful to make my camera move around (sample code).

This required that I understand the coordinate space all this operates in. According to the official docs, screen XY maps to world XY. So forward is Z+, up is Y-, right is X+. Once I figured that out, I had to prepare a worldMatrix with the transform of the camera, then append the projectionMatrix. The PerspectiveProjection class always seems to assume screen coordinate (0,0) is the center of the projection, so you will have to manually offset. Maybe I was not using the projection right, since the docs imply otherwise.
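As an illustration of the matrix setup, here is a minimal sketch. It assumes a camera that can only move and turn about the vertical axis; the class name CameraMatrix and the parameters camPos and yawDegrees are invented for this example. It makes no assumption about the exact layout of the matrix that PerspectiveProjection.toMatrix3D() returns.

```actionscript
package
{
    import flash.geom.Matrix3D;
    import flash.geom.PerspectiveProjection;
    import flash.geom.Vector3D;

    // Hypothetical helper; a yaw-only camera is an assumption of this sketch.
    public class CameraMatrix
    {
        public static function build(camPos:Vector3D, yawDegrees:Number,
                                     projection:PerspectiveProjection):Matrix3D
        {
            var m:Matrix3D = new Matrix3D();

            // Move the world so the camera sits at the origin.
            m.appendTranslation(-camPos.x, -camPos.y, -camPos.z);

            // Undo the camera's rotation about the Y axis.
            m.appendRotation(-yawDegrees, Vector3D.Y_AXIS);

            // Append the projection matrix after the camera transform.
            m.append(projection.toMatrix3D());

            return m;
        }
    }
}
```

Because the projection appears to treat (0,0) as its center, the projected x and y would then need half the stage width and height added to land in the middle of the screen.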

There were two other details to sort out. First, I had to reject objects behind the camera, and second, I had to scale objects correctly so they appeared to have perspective. The solution revolved around the same information – the pre-projection Z. By hiding all objects with Z < 0 and scaling by focalLength / preZ, I was able to get it to behave properly.
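In code, the culling and scaling step could look something like the following. This is a plain pinhole-style projection written under my own assumptions, not a copy of the sample; the names PerspectivePlacer, view, centerX and centerY are hypothetical. It hides objects at Z <= 0 rather than Z < 0 so that the division is always safe.

```actionscript
package
{
    import flash.display.DisplayObject;
    import flash.geom.Vector3D;

    // Hypothetical helper; names are illustrative.
    public class PerspectivePlacer
    {
        // view is the object's position in camera space, before projection.
        public static function place(sprite:DisplayObject, view:Vector3D,
                                     focalLength:Number,
                                     centerX:Number, centerY:Number):void
        {
            // Reject anything at or behind the camera plane.
            if (view.z <= 0)
            {
                sprite.visible = false;
                return;
            }

            // Perspective factor from the pre-projection Z.
            var s:Number = focalLength / view.z;

            sprite.visible = true;
            sprite.x = centerX + view.x * s;
            sprite.y = centerY + view.y * s;

            // Distant objects shrink, near objects grow.
            sprite.scaleX = s;
            sprite.scaleY = s;
        }
    }
}
```

An object at a distance equal to focalLength is drawn at its natural size, and one twice as far away is drawn at half size.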

Next up is Matrix3D.transformVector… which is slow. Transforming 1000 vectors eats 3ms in release build! This is really slow in absolute terms (Ralph Hauwert has a good example of the same functionality running much, much faster). I didn’t really want to introduce Alchemy for this project. But we can use AS3 code that avoids the allocations, saving us GC overhead and getting us an incremental improvement in performance.

Andre Michelle has some interesting thoughts on the problem of temporary objects related to transformations (see http://blog.andre-michelle.com/2008/too-late-very-simple-but-striking-feature-request-for-flash10-3d/). I did notice that Utils3D.projectVectors had some options for avoiding allocations, but it did not seem to run significantly faster (even in release build). (sample code for using projectVectors)
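For reference, a sketch of how Utils3D.projectVectors can be fed with preallocated buffers so that nothing is allocated per frame. The wrapper class ProjectBatch is made up for this example; only the projectVectors call itself is part of the Flash API.

```actionscript
package
{
    import flash.geom.Matrix3D;
    import flash.geom.Utils3D;

    // Hypothetical wrapper that owns the buffers projectVectors works on.
    public class ProjectBatch
    {
        public var verts:Vector.<Number>;      // input: x, y, z per point
        public var projected:Vector.<Number>;  // output: x, y per point
        public var uvts:Vector.<Number>;       // output: u, v, t per point

        public function ProjectBatch(count:int)
        {
            // Allocate everything once, up front.
            verts = new Vector.<Number>(count * 3);
            projected = new Vector.<Number>(count * 2);
            uvts = new Vector.<Number>(count * 3);
        }

        // Projects every point in verts through m, reusing the same
        // output vectors on each call.
        public function project(m:Matrix3D):void
        {
            Utils3D.projectVectors(m, verts, projected, uvts);
        }
    }
}
```

The caller fills verts with the particle positions, calls project once per frame, and reads the screen coordinates back out of projected in pairs.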

In the end, I settled on my own implementation of transformVectors, as it seemed to give the best balance between performance and ease of use. There’s a final version of the sample app up on github where you can toggle between transformVector and the AS3 version by commenting out line 105/106, so you can test it for yourself. The transform function took some effort to get right, so here it is to save you the pain of implementing it yourself. It transforms i by m and stores the result in o.

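        // Transforms i by m and writes the result into o, without creating a new Vector3D.
        final public function transformVec(m:Matrix3D, i:Vector3D, o:Vector3D):void
        {
            // Copy the inputs first, so i and o may be the same object.
            const x:Number = i.x, y:Number = i.y, z:Number = i.z;
            const d:Vector.<Number> = m.rawData;

            // The translation terms are d[12], d[13] and d[14].
            o.x = x * d[0] + y * d[4] + z * d[8] + d[12];
            o.y = x * d[1] + y * d[5] + z * d[9] + d[13];
            o.z = x * d[2] + y * d[6] + z * d[10] + d[14];
            o.w = x * d[3] + y * d[7] + z * d[11] + d[15];
        }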
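One possible variation on the function above, sketched under the assumption that the output vector has already been filled with reusable Vector3D instances: read rawData once per batch instead of once per point. Whether that matters depends on whether the rawData getter hands back a fresh copy on each access, which I have not measured here, so treat it as something to check in a profiler. The names BatchTransform and transformAll are hypothetical.

```actionscript
package
{
    import flash.geom.Matrix3D;
    import flash.geom.Vector3D;

    // Hypothetical batch variant of transformVec; not from the repository.
    public class BatchTransform
    {
        // output must have at least input.length preallocated Vector3D objects.
        public static function transformAll(m:Matrix3D,
                                            input:Vector.<Vector3D>,
                                            output:Vector.<Vector3D>):void
        {
            // Read the 16 matrix values once for the whole batch.
            var d:Vector.<Number> = m.rawData;
            var n:int = input.length;

            for (var k:int = 0; k < n; k++)
            {
                var src:Vector3D = input[k];
                var dst:Vector3D = output[k];

                // Copy first so input and output may share objects.
                var x:Number = src.x, y:Number = src.y, z:Number = src.z;

                dst.x = x * d[0] + y * d[4] + z * d[8]  + d[12];
                dst.y = x * d[1] + y * d[5] + z * d[9]  + d[13];
                dst.z = x * d[2] + y * d[6] + z * d[10] + d[14];
                dst.w = x * d[3] + y * d[7] + z * d[11] + d[15];
            }
        }
    }
}
```

The indexing is the same as in transformVec: each group of four values in rawData is one column of the matrix, with the translation in the last column.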

Time for some conclusions. I think that the 3D capabilities built into DisplayObject are OK, but very focused on light-weight graphic design use. Building a significant 3D application requires you to write your own rendering code built on Flash’s 2D capabilities (either DisplayObjects or drawTriangles and friends). The 3d math classes are OK, but immature. Some things are very handy (like the prepend/append versions of all the methods on Matrix3D), but the tendency for Flash APIs to implicitly allocate temporary objects limits the usefulness of some of the most central API calls. In addition, important assumptions like the order of the values in Matrix3D.rawData were not documented, leading to frustrating trial and error. I am excited to see Flash’s 3d capabilities mature.

Ben Garney

December 14, 2009

Code, Flash, How To

12 responses to “3D in Flash 10 & Git”

Pedram Pourhossein

March 14, 2010 at 8:44 am

Hi Ben, I just found your blog, It's really cool, I like it. 😉 thanks for your good blog.
Ben Garney

January 25, 2010 at 10:01 am

I saw your blog post. Demo runs nice and fast. Look forward to hearing what you think of PBE. 🙂
Ben Garney

January 25, 2010 at 10:01 am

Thanks for sharing your numbers. I look forward to hearing what you come up with. 🙂
ianpretorius

January 23, 2010 at 1:55 pm

Hi Ben,
Great experiments and revelations. I have just started getting into writing 3D code in Flash. I noticed posts here from Away3d folk. I love that they have an open source engine available, but I love it even more to learn how to do things for myself and to understand how it all works. If I didn't do that I don't think I would be interested in my occupation any longer 😉 I will be following your blog and code samples, as you have way more experience than me and I seem to share a love for coding game engines. I recently took my own tile engine experiment off of the “unfinished experiments shelf” a month ago and gave it a solid month of development in my spare time, knowing nothing about PBE existing and being open source. I will have a look under the hood and see where I could learn some more there too.
Stéphane Rainville

December 29, 2009 at 7:30 pm

Here are my benchmark results for 10,000 Vector3D transformations (mid-class laptop):
Utils3D.projectVectors: 20 ms
Adding screen/2 to all coordinates: 25 ms
Custom AS3 transformation code (x/z and y/z etc.): 172 ms
Too bad PerspectiveProjection.projectionCenter does not work outside of a Sprite (my understanding at least).
Will definitely give Alchemy a try. Will post results within 2 weeks (vacation, yeah).
Thanks for the tip.
Ben Garney

December 29, 2009 at 6:08 pm

I think the way to go is doing it all in custom code to avoid allocation overhead. The math is (unfortunately) more involved, but the payoff is total control and better performance. If you were able to, putting your transform pipeline into Alchemy would probably yield a significant speedup over that.
Stephane Rainville

December 29, 2009 at 4:24 am

I think I just hit the same problem you did, where setting PerspectiveProjection.projectionCenter has no impact whatsoever on the underlying matrix (toMatrix3D), so all my drawing points to the upper left corner (0,0) when I use Utils3D.projectVectors.
Do you still think your custom code is the way to go, or did you find out how to make it work?
I'm looking at the away3D code and they seem to bypass PerspectiveProjection and create the matrix manually using heavy math for Utils3D.projectVectors. I have yet to find out if they manually push the X and Y to the center of the screen.
Urieal

December 16, 2009 at 7:09 pm

I'd be interested in any results you come up with. Thanks for the source code sharing on git…helps me be a better coder 😀
Ben Garney

December 14, 2009 at 8:03 pm

I thought so too, but my no-allocation path seems to perform similarly. I am going to have to get out gskinner's perf test harness and run some real comparisons, because I was not able to get really consistent results from my tests.
When you say “As running lets say 50000 through them, comes at a similar cost.”, do you mean you have seen the Matrix3D/Utils3D APIs perform better?
Ben Garney

December 14, 2009 at 8:02 pm

Go on?
I saw the most recent perf number for away3d (from http://sleepydesign.blogspot.com/2009/08/flash-…) was ~14k faces at 30Hz. I assume you are using tristrips to reduce transform overhead, so it's actually a little bit less than that many vertices.
30Hz means a frame time of 1000/30, or roughly 33ms. If I'm getting 1000 verts in 3ms, then 10,000 verts in 30ms isn't significantly better. And I am running on a MacBook Pro, so if you are on a fast quadcore desktop, the difference in performance is probably due to faster RAM/CPU rather than any improvement in code.
On the other hand, if you are able to get a faster time than I am, I'd definitely be interested in hearing more about what you're doing. I had a few instances that seemed to run 10x faster, but I was never able to get it to reliably reproduce. So it's entirely possible that there are some big potential wins. I'd love to find out I am wrong. 🙂
Ralph

December 14, 2009 at 12:24 pm

Hey Ben,
Although that 3ms for transformVector on 1000 particles is slow, I'm thinking that's an allocation thing, as running, let's say, 50000 through them comes at a similar cost.
Ralph.
katopz

December 14, 2009 at 10:43 am

you never try away3dlite?