With the recent release of AIR 2.7, I thought it would be a good idea to return to the basics of efficient code. While performance is greatly improved on mobile and desktop, it’s still the coder’s responsibility to stick to best practices and avoid performance killers. The same is true for Flash inside a browser (Flash Player 10.3).
Slow and heavy code is more apparent on a mobile platform – often the desktop hardware will compensate with a powerful CPU and plenty of memory so you might not even notice that your application or game is lagging.
Intro: The Runtime
Every game or application makes use of two major resources: CPU and memory. How you use them determines how well it runs: hogging the CPU or memory will make it slow or choppy, and in some cases may crash the app. The following are a couple of things every decent Flash developer should understand about the Flash Player and AIR runtime. By the way, every time you read ‘Flash Player’ here it refers to the AIR runtime as well.
Flash is a JIT environment (JIT = Just In Time) and uses a VM (Virtual Machine) dubbed AVM2. The Flash Player uses a familiar model of frame cycles – a loop where each frame is processed and then rendered to the screen in sequence. Running on a single thread, the frame execution model divides the frame time slice into two distinct sections, one for user code execution and one for rendering content to the screen. The player tries to balance the two, in a model that is often referred to as the elastic racetrack.
Objects in memory are garbage collected, and there is very little control over when and how quickly an object is collected and released. Memory allocation and release are costly operations, so they are best batched into a few large blocks rather than spread over many small objects. Also keep in mind that a single large object fragments memory less than multiple small objects.
Now, without further ado, on to the list.
10. Bad code, in general
Bad code is typically the root cause of performance problems. Simply put, if it’s not architected and designed to be efficient, it will not be efficient. A few scenarios come to mind:
- Coding a large system piece by piece without ever considering the big picture.
- Leaving old legacy code intact long after the business requirements have changed. Again, lack of design before coding.
- Coders are pressured to meet deadlines under a tight schedule and abandon good practices in the hope of getting the job done faster – this usually has the opposite effect.
9. Using Flex when it’s not needed
If you choose to use Flex over pure AS3, there had better be a very good reason for it. Needing a few labels and a couple of buttons in your app is not reason enough. There is a huge overhead involved with the Flex component framework, both in memory and CPU usage. Treat Flex as a necessary evil and avoid it if possible – there are plenty of alternative UI component frameworks that are lighter, like MinimalComps or Reflex, to name a few. In many cases it’s easier to implement a few UI controls yourself, based on supplied graphic assets.
8. Abusing MXML in Flex Apps
If, despite the above, you are using Flex, try to minimize the use of MXML, and use it in high-level components only. MXML markup generates code at build time, and often coders are not aware that they could write better code themselves.
A typical way coders abuse MXML is by nesting boxes and groups only to create a layout, each with constraint properties (left/right/top/bottom). This generates unnecessarily bloated code and hogs CPU cycles on every redraw, since the layout needs to be recalculated across all those nested groups.
The same goes for data binding – using binding when you can simply assign a value is a sure way to overload the app, particularly when it’s used on dozens of fields.
7. Misusing the MVC pattern
While a good programmer knows how to use patterns, a great programmer knows when to avoid them. Bending a component so that it uses MVC is wrong if the pattern is not needed.
Take a simple widget as an example: one could separate the code into a view class with 4-10 lines of code, a mediator class with all the functionality, and throw in a few command classes, each to activate a specific function on the mediator. Almost all classes will require injection and you’ll end up with 4-6 different classes. Now, just wiring them up takes more effort than the entire widget – this is over-engineering: producing pieces of code that are overly complicated where simplicity is needed.
So for small widgets and controls, opt for the simple one class combo with a few public functions, and write it with as few lines as humanly possible. Simplicity is the key, and MVC should be left for top-level components in more complex cases.
6. Not using a Tweener
Tweeners are the secret weapon of Flash coders. When you need animation sequencing or delayed calls, it’s best to use a tweener. I use TweenMax and its delayedCall() method to make delayed calls with ease and very little overhead, but any other tweener will do just as well.
Speaking of performance, a tweener is almost guaranteed to work faster than any half-baked solution you cooked up when you didn’t have the time to design properly. It has its own internal timer so you never have to use one yourself.
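To make the idea of a delayed call concrete, here is a minimal sketch of a frame-driven scheduler. It is written in TypeScript, whose syntax is close to ActionScript 3, so it can run outside the player. The names DelayedCalls, add and tick are made up for illustration, and this is not how TweenMax is implemented; it only shows how one shared per-frame tick can serve any number of pending calls.

```typescript
// Hypothetical frame-driven scheduler: one shared tick serves all pending calls.
type Callback = () => void;

interface PendingCall {
  framesLeft: number; // frames remaining before the callback fires
  callback: Callback;
}

class DelayedCalls {
  private pending: PendingCall[] = [];

  // Schedule `callback` to run after `frames` ticks.
  add(frames: number, callback: Callback): void {
    this.pending.push({ framesLeft: frames, callback: callback });
  }

  // Call exactly once per frame, e.g. from a single enter-frame handler.
  tick(): void {
    // Iterate backwards so finished entries can be removed in place.
    for (let i = this.pending.length - 1; i >= 0; i--) {
      const call = this.pending[i];
      call.framesLeft--;
      if (call.framesLeft <= 0) {
        this.pending.splice(i, 1);
        call.callback();
      }
    }
  }

  // Number of calls still waiting to fire.
  get count(): number {
    return this.pending.length;
  }
}
```

A caller would create one DelayedCalls instance, register calls with add(30, showMenu) (showMenu being any function of yours), and invoke tick() once per frame.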
5. Using Timers
When people ask me what is the best way to use timers, my answer is simple: don’t use them – ever.
Besides the fact that timers are inaccurate and not very easy to use, they have a certain overhead that can add up and cripple the performance of your game. Timers hook into the Flash Player environment and calculate time on each frame, using CPU cycles on every frame for each timer that is created. These quickly add up and slow down the app – the more timer instances linger in the system, the more you can expect slow and choppy gameplay or an unresponsive UI.
As an alternative, use a tweener or simply listen for the ENTER_FRAME event on the stage, which is triggered on every frame.
Just one word of warning: enter frame is a system event – to minimize impact, use it sparingly. If you are writing a game, it’s best to use it just for one object (your main game class) and have a single centralized enter frame handler make all the subsequent calls to game objects.
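The single centralized handler described above might look roughly like the following sketch. It is written in TypeScript (close to ActionScript 3 in syntax) so it can be run on its own. Updatable, GameLoop and Mover are hypothetical names, and in a real project onEnterFrame would be the one function attached to the stage’s enter frame event.

```typescript
// Hypothetical interface for anything that needs per-frame work.
interface Updatable {
  update(deltaMs: number): void;
}

class GameLoop {
  private objects: Updatable[] = [];

  register(obj: Updatable): void {
    if (this.objects.indexOf(obj) === -1) this.objects.push(obj);
  }

  unregister(obj: Updatable): void {
    const i = this.objects.indexOf(obj);
    if (i !== -1) this.objects.splice(i, 1);
  }

  // The only enter-frame handler in the game; it fans out to all objects.
  onEnterFrame(deltaMs: number): void {
    // Backward iteration stays valid if an object unregisters itself
    // from inside its own update().
    for (let i = this.objects.length - 1; i >= 0; i--) {
      this.objects[i].update(deltaMs);
    }
  }
}

// Example game object: moves right at a fixed speed.
class Mover implements Updatable {
  x = 0;
  speed: number; // pixels per millisecond
  constructor(speed: number) {
    this.speed = speed;
  }
  update(deltaMs: number): void {
    this.x += this.speed * deltaMs;
  }
}
```

Game objects never subscribe to the frame event themselves; they register with the loop and get removed with unregister() when they are disposed.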
4. Memory Allocation and Release
As already mentioned, allocating memory is a costly operation. Whenever you use ‘new’ to create an object, it triggers a bunch of low-level functions inside the AIR runtime/Flash Player, such as heap allocation, garbage collection and buffer locking. It’s important to avoid creating and destroying objects frequently or inside the game loop. Either create them ahead of time or use Object Pooling to avoid the performance hit.
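Object pooling can be sketched in a few lines. The example below is TypeScript rather than ActionScript 3 (the two read almost the same) so that it runs on its own. Bullet and BulletPool are illustrative names, not part of any Flash API.

```typescript
// Hypothetical pooled object; any frequently created game object would do.
class Bullet {
  x = 0;
  y = 0;
  active = false;
}

class BulletPool {
  private free: Bullet[] = [];

  // Allocate everything up front, outside the game loop.
  constructor(size: number) {
    for (let i = 0; i < size; i++) this.free.push(new Bullet());
  }

  // Reuse a pooled bullet; allocate only if the pool has run dry.
  acquire(x: number, y: number): Bullet {
    let b = this.free.pop();
    if (b === undefined) b = new Bullet();
    b.x = x;
    b.y = y;
    b.active = true;
    return b;
  }

  // Hand the bullet back instead of letting it become garbage.
  release(b: Bullet): void {
    b.active = false;
    this.free.push(b);
  }

  // Number of bullets currently waiting in the pool.
  get available(): number {
    return this.free.length;
  }
}
```

Inside the game loop you would call acquire() when a bullet is fired and release() when it leaves the screen, so no ‘new’ runs per frame as long as the pool is large enough.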
Regarding garbage collection, stick to the basics: don’t leave any references behind (especially class members) and clean up your classes so they get picked up. Frequently calling System.gc() (in AIR only) is not a good idea, since the player invokes garbage collection anyway and the extra calls add overhead.
Pay special attention to asynchronous objects like loaders: you need to call loader.unloadAndStop() to release the loaded content.
3. Overloaded display list
The Flash Player uses a deferred rendering model based on a scene graph, or display list. This means that instead of issuing draw calls, we place display objects on the stage and the rest is done internally in the player runtime environment. On every render cycle, it traverses the display list hierarchy top to bottom and draws every object in its place. While the implementation is internal to the player, the time it takes to draw one frame depends on the size and complexity of the display list.
One way to reduce render time is to minimize the number of display objects by removing any hidden or unnecessary objects. In a game you might want to keep references to the game objects and remove them from the display list when they are off-screen. Also make sure that removed objects are actually disposed of, by removing any references and event listeners and stopping movie clips and sounds; otherwise they will keep using CPU cycles on every frame.
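The off-screen culling idea can be sketched as follows, in TypeScript so that it runs without the player. Sprite and DisplayList here are deliberately tiny stand-ins for real display objects and containers, and cullOffScreen is a hypothetical helper. It assumes a horizontally scrolling game whose viewport spans x = 0 to viewWidth.

```typescript
// Minimal stand-ins for a display object and its container (hypothetical).
interface Sprite {
  x: number;
  width: number;
}

class DisplayList {
  private children: Sprite[] = [];

  contains(s: Sprite): boolean {
    return this.children.indexOf(s) !== -1;
  }

  add(s: Sprite): void {
    if (!this.contains(s)) this.children.push(s);
  }

  remove(s: Sprite): void {
    const i = this.children.indexOf(s);
    if (i !== -1) this.children.splice(i, 1);
  }

  get numChildren(): number {
    return this.children.length;
  }
}

// The game keeps a reference to every object in `all`, but only objects
// that overlap the viewport stay on the display list.
function cullOffScreen(all: Sprite[], list: DisplayList, viewWidth: number): void {
  for (let i = 0; i < all.length; i++) {
    const s = all[i];
    const visible = s.x + s.width > 0 && s.x < viewWidth;
    if (visible) {
      list.add(s);
    } else {
      list.remove(s);
    }
  }
}
```

Running cullOffScreen once per frame, after positions are updated, keeps the rendered list short while the game logic still sees every object.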
2. Drawing API and Vector Rendering
Rendering in the Flash Player consists mainly of rasterization and compositing, and either can become a bottleneck.
In compositing, alpha blending and filters are applied. Keep these under a tight lid, especially filters, which can be extremely CPU intensive. One way to optimize filters is to compose them off screen by drawing the filtered object into a bitmap, and then use the bitmap in place of the original object.
Rasterization is the process of drawing all vector shapes and fonts into pixels (in a bitmap). This is where the old ‘Bitmap vs Vector’ debate takes place – there is a trade-off between memory and CPU usage. While vector shapes take little space, they are slow to rasterize and take a toll on the CPU on every redraw. Bitmaps, on the other hand, take more space up-front but they skip the rasterization on redraw, so drawing is much faster.
Bitmaps also allow for baking in post-processing effects, skipping the run-time filters. Since a large portion of the frame rendering time is spent on rasterizing vector shapes, we can try to rasterize some elements ahead of time using off-screen blitting.
Some game engines like Flixel and PushButton offer bitmap-only rendering, where all display objects are rasterized in advance and then blitted to a central bitmap buffer, which is the only object on the display list. With this technique rendering can be orders of magnitude faster, but it has its own limitations.
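To illustrate what blitting to a central buffer means, here is a simplified sketch in TypeScript (used here because it runs without the player). It is not code from Flixel or PushButton. Raster, createRaster and blit are hypothetical names, pixels are packed 32-bit values, positions are assumed to be whole pixels, and the copy is opaque with no alpha blending.

```typescript
// A pre-rasterized image: packed 32-bit pixels stored row by row (hypothetical type).
interface Raster {
  width: number;
  height: number;
  pixels: Uint32Array;
}

function createRaster(width: number, height: number, fill: number): Raster {
  const pixels = new Uint32Array(width * height);
  pixels.fill(fill);
  return { width: width, height: height, pixels: pixels };
}

// Copy `src` into `dest` with its top-left corner at (dx, dy),
// clipping against the destination bounds. Allocates nothing.
function blit(src: Raster, dest: Raster, dx: number, dy: number): void {
  const x0 = Math.max(0, dx);
  const y0 = Math.max(0, dy);
  const x1 = Math.min(dest.width, dx + src.width);
  const y1 = Math.min(dest.height, dy + src.height);
  if (x0 >= x1 || y0 >= y1) return; // completely outside the buffer

  for (let y = y0; y < y1; y++) {
    let s = (y - dy) * src.width + (x0 - dx); // read index in the source
    let d = y * dest.width + x0;              // write index in the buffer
    for (let x = x0; x < x1; x++) {
      dest.pixels[d++] = src.pixels[s++];
    }
  }
}
```

Each frame, the game would clear the central buffer and call blit() once per visible object; only the buffer itself needs to live on the display list.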
Also, for some platforms (mobile devices and TV), AIR has built-in GPU acceleration for bitmap rendering – many developers prefer bitmaps for this reason alone.
1. Using Events
There is no easy way to say this – the Flash event/listener model is one of the weak points in the runtime environment. It’s slow, messy, and requires extra boilerplate code to achieve the trivial. Most Flash coders tolerate it, but you don’t have to – I’ll show a few good alternatives.
So what exactly is wrong with native Flash events?
Every call to dispatchEvent() typically involves a freshly allocated event object, and re-dispatching an event makes a copy (a clone) of it. When this happens inside the game loop you get lots of small memory allocations every frame – a sure way to cause a performance hit. From my days writing console games in C++ I was taught to avoid this scenario, and it applies many times over in a JITed environment like AIR or the Flash Player.
Here’s a partial list of problems in the Flash event model:
- Custom events require new classes
- You must allocate new objects for each dispatch – a costly operation
- Registering for an event via addEventListener leaves a reference to the listener object that keeps it from being released automatically.
- It does not compare well to modern event models, such as C# events.
In short, this model is slow, heavy and inefficient. Since we are after high performance and efficient memory usage, I would strongly suggest considering an alternative. This is only true for custom events – on display objects you may still need to use Flash events for mouse, keyboard or enter frame – but you can keep these to a minimum.
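One possible shape for such an alternative is a lightweight signal: a plain list of callbacks that receive the payload directly, so a dispatch creates no event object and no custom event class is needed. The sketch below is only an illustration and is written in TypeScript, which is close to ActionScript 3 and can be run outside the player. Signal, Listener and the method names are hypothetical.

```typescript
// Hypothetical lightweight signal: listeners are plain functions and the
// payload is passed straight through, so dispatching allocates nothing.
type Listener<T> = (value: T) => void;

class Signal<T> {
  private listeners: Listener<T>[] = [];

  // Register a listener; adding the same function twice has no effect.
  add(listener: Listener<T>): void {
    if (this.listeners.indexOf(listener) === -1) this.listeners.push(listener);
  }

  // Unregister explicitly so the signal does not keep the listener alive.
  remove(listener: Listener<T>): void {
    const i = this.listeners.indexOf(listener);
    if (i !== -1) this.listeners.splice(i, 1);
  }

  removeAll(): void {
    this.listeners.length = 0;
  }

  // Call every listener in registration order.
  // Assumes listeners are not added or removed during a dispatch.
  dispatch(value: T): void {
    for (let i = 0; i < this.listeners.length; i++) {
      this.listeners[i](value);
    }
  }

  get numListeners(): number {
    return this.listeners.length;
  }
}
```

A game class would expose something like a scoreChanged signal of type Signal<number>, interested objects would call add() with a handler, and the owner would call dispatch(newScore). Listeners still have to be removed on disposal, exactly as with addEventListener.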
Writing fast and efficient code in Flash and AIR requires some attention and planning to avoid common pitfalls. Keeping the display list light and avoiding costly operations inside the update loop are key to smooth, solid performance.