EDIT: I've reported this as a bug on Adobe's Jira, go there and vote for it: http://bugs.adobe.com/jira/browse/SDK-29904
EDIT: I've posted a follow up to this post using the non-debug player: http://jackviers.blogspot.com/2011/05/flex-3-vs-flex-4-follow-up.html
A week ago, a friend and respected colleague of mine, Paul Smith IV, mentioned in passing that Flex 4 was slower than Flex 3. He had no quantitative data to back up the claim, and a quick perusal of Google didn't reveal any either. About to embark on a large migration of a Flex 3 codebase at work, I decided that a performance comparison was in order. My findings show that the average Flex 4 component is 163.41% (531.77 ms) slower than its direct Flex 3 counterpart, that this difference increases application startup time linearly in statically defined applications, and that it grows roughly logarithmically as components are dynamically added to the application view. Though Flex 4 applications run slower, they produce .swf files 82.26% smaller than Flex 3 applications. On a 512 kbps line with 10% overhead, this results in 88.89% faster download times with Flex 4 vs. Flex 3. If the average human speed of perception is 16 ms, the perceived performance gain using Flex 4 is 87.97% over Flex 3. Because this perceived gain only applies to initial load, two questions arise: Should Flex 4 be used in large, dynamically downloaded applications? Do the benefits of code reuse by developers, easier view management, and the separation of concerns inherent in Flex 4's Spark component architecture outweigh the runtime performance degradation as compared to Flex 3?
Test Machine Hardware and OS
MACHINE: MacBookPro5.2
PROCESSOR NAME: Intel Core 2 Duo
PROCESSOR SPEED: 3.06 GHz
RAM: 4 GB
OPERATING SYSTEM: Mac OS X 10.6.6 (10J567)
Kernel: Darwin 10.6.0
Browser and Flash Player Version
BROWSER VENDOR: Mozilla
BROWSER VERSION: Firefox 3.6.15
FLASH PLAYER VERSION: 10.2.152.33 Debug
Tests
Flex3PerformanceTest
StaticFlex3PerformanceTest100
StaticFlex3PerformanceTest150
StaticFlex3PerformanceTest200
StaticFlex3PerformanceTest250
StaticFlex3PerformanceTest300
StaticFlex3PerformanceTest600
StaticFlex3PerformanceTest1000
Flex3DynamicPerformanceTest
Flex4PerformanceTest
StaticFlex4PerformanceTest100
StaticFlex4PerformanceTest150
StaticFlex4PerformanceTest200
StaticFlex4PerformanceTest250
StaticFlex4PerformanceTest300
StaticFlex4PerformanceTest600
StaticFlex4PerformanceTest1000
Flex4DynamicPerformanceTest
Test Source Code
The source code for these tests can be retrieved and viewed from GitHub at https://github.com/jackcviers/Flex-3---Flex-4-Performance-Test.
Test SDKs
Flex 3: 3.5
Flex 4: 4.1.0.16076
Test Methodology
According to Adobe's DevNet article "Differences between Flex 3 and 4", the following are the Flex 3 components and their new Flex 4 counterparts, separated by a "/" character (original table):
[Fig 1: Flex 3 components and their Flex 4 analogues.]
For the tests to be valid, I trimmed out the components whose Flex 4 analogues live in spark.primitives.*, because the Flex 4 primitives are not "components" and do not adhere to the Flex component lifecycle. I also used mx.controls.Image in both the Flex 3 and Flex 4 tests, as Flex 4 has no analogue for the behavior of mx.controls.Image present in Flex 3.
I then set up three types of tests. The first statically defined an application containing each of the components in Fig. 1 above, using the Flex 3 MX components for the Flex 3 version of the test and the Flex 4 Spark components for the Flex 4 version. The second used the longest-running component from the first test to check whether the number of instances statically defined at author time had any effect on performance. The third tested whether runtime changed as the longest-running component from Test 1 was added to the stage dynamically.
Test 1
Two Flex projects were created in Flash Builder 4, one for Flex 3 and one for Flex 4. The applications' Flex lifecycle events (preinitialize, initialize, creationComplete, and applicationComplete) were bound to corresponding handlers that recorded the time of each event in a dynamic AS3 Object instance. The Object's top-level key was "testerApp"; the times were stored in a nested AS3 Object under that key as "preinitializeTime", "initializeTime", "creationCompleteTime", and "applicationCompleteTime". Each component defined in MXML was likewise bound to handlers for "preinitialize", "initialize", and "creationComplete". The only difference between the handlers in the two test applications was that the Flex 4 version cast IVisualElement instances to UIComponent to obtain the components' ids for use as keys in the time-recording Object. At "applicationComplete" the application looped through all of its immediate visual children, tracing each component's id and the time elapsed between "initialize" and "creationComplete". A minimal sketch of the harness follows.
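The repository linked above contains the actual test applications; the following is only a minimal sketch of the Flex 4 harness as described, with assumed handler and variable names (recordAppTime, recordComponentTime, times):

```mxml
<?xml version="1.0" encoding="utf-8"?>
<!-- Minimal sketch of the Flex 4 timing harness; handler and variable
     names here are assumptions, not the repository's exact code. -->
<s:Application xmlns:fx="http://ns.adobe.com/mxml/2009"
               xmlns:s="library://ns.adobe.com/flex/spark"
               preinitialize="recordAppTime('preinitializeTime')"
               initialize="recordAppTime('initializeTime')"
               creationComplete="recordAppTime('creationCompleteTime')"
               applicationComplete="reportTimes()">
    <fx:Script>
        <![CDATA[
            import flash.utils.getTimer;
            import mx.core.UIComponent;

            // Dynamic store: times.testerApp.preinitializeTime, etc.
            private var times:Object = {testerApp: {}};

            private function recordAppTime(key:String):void {
                times.testerApp[key] = getTimer();
            }

            // The Flex 4 handlers cast the visual element to UIComponent
            // to read its id, as described above.
            private function recordComponentTime(target:Object, key:String):void {
                var ui:UIComponent = target as UIComponent;
                if (!times[ui.id]) times[ui.id] = {};
                times[ui.id][key] = getTimer();
            }

            private function reportTimes():void {
                times.testerApp.applicationCompleteTime = getTimer();
                // Walk the immediate visual children, tracing each id and
                // the elapsed time between initialize and creationComplete.
                for (var i:int = 0; i < numElements; i++) {
                    var child:UIComponent = getElementAt(i) as UIComponent;
                    var t:Object = times[child.id];
                    trace(child.id, t.creationCompleteTime - t.initializeTime, "ms");
                }
            }
        ]]>
    </fx:Script>

    <!-- Each tested component binds the same three lifecycle events: -->
    <s:Button id="testButton" label="Button"
              preinitialize="recordComponentTime(event.target, 'preinitializeTime')"
              initialize="recordComponentTime(event.target, 'initializeTime')"
              creationComplete="recordComponentTime(event.target, 'creationCompleteTime')"/>
</s:Application>
```

The Flex 3 version is identical except that it uses the mx namespace and components and needs no cast to reach the id property.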
Test 2
In the same two Flex projects, I created several more applications to test the longest-running component from Test 1 above (in both cases the mx/spark Button component), statically defined without a Repeater in instance counts of 100, 150, 200, 250, 300, 600, and 1000. Each component and each application was bound to lifecycle events and measured as in Test 1, with the results traced as in Test 1; a sketch follows.
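A sketch of one such application (Flex 4 version); only the preinitialize-to-applicationComplete span matters for this test, and each instance count got its own file:

```mxml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch of a Test 2 application (Flex 4 version). The real files
     repeat the Button declaration 100 to 1000 times, one file per count. -->
<s:Application xmlns:fx="http://ns.adobe.com/mxml/2009"
               xmlns:s="library://ns.adobe.com/flex/spark"
               preinitialize="start = getTimer()"
               applicationComplete="trace('run time:', getTimer() - start, 'ms')">
    <fx:Script>
        <![CDATA[
            import flash.utils.getTimer;
            private var start:int;
        ]]>
    </fx:Script>
    <s:Button label="1"/>
    <s:Button label="2"/>
    <!-- ...declarations repeated up to the target instance count;
         no Repeater is used, per the methodology. -->
</s:Application>
```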
Test 3
In the same two Flex projects I created an additional application to test dynamic addition at runtime of the longest-running component from Test 1 above (again, the mx/spark Button). In this version of the test, a timer added one instance every 10 ms for 1000 repetitions, and the results were measured as in the two tests above after the 1000th repetition. The timer was created in the application's preinitialize event handler and started at applicationComplete; a sketch follows.
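A sketch of the dynamic-addition harness (Flex 4 version; setUpTimer, addButton, and report are assumed names):

```mxml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch of the Test 3 harness (Flex 4 version): a Timer created at
     preinitialize and started at applicationComplete adds one Button
     every 10 ms for 1000 repetitions. -->
<s:Application xmlns:fx="http://ns.adobe.com/mxml/2009"
               xmlns:s="library://ns.adobe.com/flex/spark"
               preinitialize="setUpTimer()"
               applicationComplete="timer.start()">
    <fx:Script>
        <![CDATA[
            import flash.events.TimerEvent;
            import flash.utils.Timer;
            import spark.components.Button;

            private var timer:Timer;

            private function setUpTimer():void {
                timer = new Timer(10, 1000);
                timer.addEventListener(TimerEvent.TIMER, addButton);
                timer.addEventListener(TimerEvent.TIMER_COMPLETE, report);
            }

            private function addButton(event:TimerEvent):void {
                var button:Button = new Button();
                button.label = String(timer.currentCount);
                // Per-component lifecycle listeners would be attached
                // here, as in Test 1, before adding to the display list.
                addElement(button);
            }

            private function report(event:TimerEvent):void {
                // After the 1000th repetition, trace the recorded
                // component runtimes as in Tests 1 and 2.
                trace("added", timer.currentCount, "buttons");
            }
        ]]>
    </fx:Script>
</s:Application>
```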
Expectations and Assumptions
Before running the tests, I expected Flex 4 to be a performance improvement over Flex 3. The separation of concerns in the Flex 4 architecture should have (and did) allow framework developers to reuse more business/behavioral classes to display the vastly different looks and feels needed for common control components. The visual characteristics should have moved into smaller, more performant skin classes. Bindings should have been reduced, and the code reuse across components should have freed more time for testing and performance tuning. Additionally, the base classes of both frameworks existed in the previous Flex 3 release and should have been improved by the community between the 3.5 and 4.1 releases.
Results
Test 1
The results table (included with the linked spreadsheets) is arranged from slowest to fastest component; the difference column is each component's Flex 4 runtime minus its Flex 3 runtime. In all cases Flex 4 was slower. However, Flex 4 produced a much smaller .swf file. I graphed the component runtimes in a bar chart:
[Fig 2: Actual component runtimes, Flex 4 (red) vs. Flex 3 (blue). "mx" precedes component names in Flex 3; "s" precedes component names in Flex 4.]
What is particularly interesting in both the table and chart above is that Flex 4 is about 1.5 times slower at runtime than Flex 3. However, the Flex 4 .swf is 82% smaller, despite requiring 474 more lines of MXML and ActionScript 3 code. I believe the decrease in file size correlates directly with the code reuse enabled by Flex 4's new skinning architecture: nine fewer component classes were needed to create the analogous test application. A minimal Spark skin illustrates the mechanism.
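This is a generic illustration, not one of the test applications: a Spark component delegates its entire appearance to a skin class, so one Button class can serve any number of looks. A minimal skin might look like this (MyButtonSkin is a hypothetical name):

```mxml
<?xml version="1.0" encoding="utf-8"?>
<!-- MyButtonSkin.mxml: a minimal Spark Button skin. The Button supplies
     behavior; the skin supplies all visuals and interaction states. -->
<s:SparkSkin xmlns:fx="http://ns.adobe.com/mxml/2009"
             xmlns:s="library://ns.adobe.com/flex/spark">
    <fx:Metadata>[HostComponent("spark.components.Button")]</fx:Metadata>

    <s:states>
        <s:State name="up"/>
        <s:State name="over"/>
        <s:State name="down"/>
        <s:State name="disabled"/>
    </s:states>

    <!-- Background drawn with a Spark primitive. -->
    <s:Rect left="0" right="0" top="0" bottom="0" radiusX="4" radiusY="4">
        <s:fill>
            <s:SolidColor color="0x4A90D9" color.over="0x5FA3E8" color.down="0x3A7BC0"/>
        </s:fill>
    </s:Rect>

    <!-- Required skin part: the Button's label. -->
    <s:Label id="labelDisplay" horizontalCenter="0" verticalCenter="0" color="0xFFFFFF"/>
</s:SparkSkin>
```

The component is then declared as `<s:Button label="OK" skinClass="MyButtonSkin"/>`; in Flex 3, the same visual change typically meant subclassing or programmatic skins compiled into each component class.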
To calculate perceived performance, I used 16 ms as the time it takes the human eye to perceive change. Since the estimated download time of the Flex 4 swf rounded to 0 seconds at line speeds above 512 kbps, I used 512 kbps to estimate download time (the arithmetic is sketched below). The Flex 4 swf downloads 9 seconds faster than the larger Flex 3 swf, an ~89% improvement in download time. This yields a human-perceived performance gain for the Flex 4 application: though it runs slower, it downloads much faster and can start earlier after page load than the Flex 3 application. Because the test applications were small (only 26 components each), and because applicationComplete time was less than the total component rendering time, I decided further testing was necessary: comparing the runtime of statically defined Flex 4 applications against statically defined Flex 3 applications. For this test I selected the poorest-performing component from each framework (the Button) and tested with 100, 150, 200, 250, 300, 600, and 1,000 buttons on the display list. I wanted to test all the way to 20,000, but the mxmlc build failed with an out-of-memory error at 10,000 buttons in Flex 4.
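For reference, a sketch of the download-time arithmetic, assuming 1 kbps = 1024 bits/s and the 10% overhead figure from the introduction (the actual swf sizes are in the linked spreadsheets):

```actionscript
// Estimated download time at a given line rate, minus protocol overhead.
// Assumes 1 kbps = 1024 bits/s; pass the swf size from the spreadsheets.
function estimatedDownloadSeconds(swfSizeBytes:Number,
                                  lineKbps:Number = 512,
                                  overhead:Number = 0.1):Number {
    var effectiveBitsPerSecond:Number = lineKbps * 1024 * (1 - overhead);
    return (swfSizeBytes * 8) / effectiveBitsPerSecond;
}
```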
Test 2
For this test I was only interested in the time the application took to run between "preinitialize" and "applicationComplete". As the results (included with the linked spreadsheets) show, Flex 4 was again the poorer performer. I graphed the Flex 4 and Flex 3 application runtimes to visualize the difference:
[Fig 3: Flex 4 vs. Flex 3 application runtimes, with trend lines and projected period results.]
As you can see, Flex 3 and Flex 4 run times increase linearly with the number of components added, and Flex 4's rate of increase is higher. The projected results fit the observed data very well.
I also graphed the difference at each level:
[Fig 4: Difference between Flex 4 and Flex 3 runtimes by number of statically defined longest-running components.]
This difference also progressed roughly linearly with the number of components. It appears that the number of components defined and on the display list at author time does affect overall application performance, confirming that the Flex 4 architecture performs worse than Flex 3 when statically defined. However, in a small application like that of Test 1, the smaller file size will still produce faster human-perceived performance.
This led me to wonder what would happen in a large, dynamic application. Would component run time increase linearly throughout the life of an application where components were added dynamically? Would dynamic addition of components occur more quickly in Flex 4 vs. Flex 3? Over time, in a large application, would runtime performance improve as a result of additional engineering for larger applications built into Flex 4's new architecture? So I wrote test number three, which builds off of tests 1 and 2. This time, I was interested in the component runtimes of dynamically added buttons, added one at a time to the display list at 10 millisecond intervals until the timer had run 1000 times. Then I would measure the component runtimes and graph the results.
Test 3
The results of Test 3 were much more complex than those of Tests 1 and 2. As you might expect, adding components after the applicationComplete event is much more expensive for a period of time; then the cost levels off into a linear trend. The result table is long, but Flex 3 again performed better. What is most interesting is the visualization of the data:
[Fig 5: Flex 4 vs. Flex 3 performance with dynamically added components.]
This chart is complex because the data returned from this test was complex. The Flex 4 results follow a generally logarithmic pattern: very high component runtimes until more than 60 components have been added, then a steep linear trend from roughly component 48 to 450, then a more gradual trend above 450 components. Flex 3 is also generally logarithmic and follows a similar pattern, but its trend lines both start and finish at lower millisecond values than Flex 4's, indicating better performance in this test.
Conclusions and Additional Points of Discussion
Across the three tests above, Flex 4 always performed worse than Flex 3 at runtime. Each Spark component performed worse than its MX counterpart, and display-list processing appears to have regressed from Flex 3 in both static and dynamic component addition. Per Test 1, the Flex 4 components perform on average 531 milliseconds, or 163.41%, worse than their Flex 3 counterparts. Flex 4 produces a much smaller swf executable than Flex 3, improving small applications' perceived performance, but in large, dynamic applications the improved download time will only be noticeable for the initially defined views and for module download times. Components added dynamically up to roughly the 60th will likely cause performance issues and a jittery UI. Although Flex 3 takes a bigger initial download hit, it performs better throughout the lifetime of the application.
Flex 4's built-in components have better look-and-feel extensibility, meaning you won't spend as much development time extending (and sometimes duplicating) base Flex components from a business/behavior-logic standpoint, but this extensibility costs the end user an average per-component runtime more than 1.5 times that of Flex 3.
When we write software, we shouldn't be as concerned with how HARD that software is to write, maintain, and extend as we are with the end-user experience. The web has taught us that performance should be our number one concern when writing web applications. Flex 4, while easier to develop and maintain than Flex 3, and thus less likely to be broken by a poorly skilled developer, should probably be avoided until Adobe and the Flex community can bring its performance on par with Flex 3. There is no reason the overhead of separating view from behavior should have nearly doubled component runtimes when the base codebase of both architectures should have improved from Flex 3 to Flex 4. The community of Flex framework developers should probably focus on improving performance in the Flex 4.5 release rather than adding more components to the Spark architecture.
What do you think? Feel free to download the source code from GitHub, along with the spreadsheets. Comment back; I'd love to see your feedback.