Unity Myth Buster: GameObject.Transform VS Cached Transform
Caching the transform was one of the first Unity optimizations I ever learned. It was drilled into me like scripture. But things change all the time, and Unity is the fastest evolving software platform I have ever used. Kudos to you, Unity! I started using Unity back at the end of the 2.x cycle, and we have come a long way. IL2CPP is freaking amazing. The more profiling I do, the more blown away I am by it. You will clearly see why in the results.
Enough blather. What we are going to do here is evaluate our assumption that caching the transform is faster than fetching it from a script or a game object on every access. We are going to take our readings on device to be as accurate as possible. In this case, that is a Samsung Galaxy S8 (Android). I will be providing a link to the evaluation project, so please report back with numbers from other platforms if you get different results. Test results from the editor are unreliable and don't let you make use of IL2CPP optimizations.
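To make the comparison concrete, here is a minimal sketch of the two access patterns being compared. The class name and the choice of reading `position` are assumptions made for illustration; the actual test code lives in the linked project and may touch the transform differently.

```csharp
using UnityEngine;

// Hypothetical example component, not taken from the test project.
public class TransformAccessExample : MonoBehaviour
{
    // Reference stored once so later reads skip the property lookup.
    private Transform cachedTransform;

    private void Awake()
    {
        cachedTransform = transform;
    }

    private void Update()
    {
        // Uncached: goes through the GameObject.transform property every time.
        Vector3 viaGameObject = gameObject.transform.position;

        // Cached: reads through the field assigned in Awake.
        Vector3 viaCache = cachedTransform.position;

        // Draw a line between the two reads so neither value goes unused.
        Debug.DrawLine(viaGameObject, viaCache);
    }
}
```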
The Setup
I wanted to check both performance and memory allocation here, so I had to do some shenanigans. The most accurate way I know of to get memory allocation readings is the Unity Profiler. To use it effectively, I needed to separate the tests so that I could fire them manually after connecting the profiler. So I made a UI, but the Unity canvas and event system create their own garbage, especially when logging strings. Ultimately, I have one button per test. Pressing it starts a 30-frame countdown, runs the test, and then waits another 30 frames before displaying the results. This way the actual test is isolated from anything else.
- Unity Version: Unity 2019.1.5f1
- Iterations: 1,000,000
- Compiler: Mono & IL2CPP
- Platform: Android 7.0
- Device: Samsung Galaxy S8
You can find this test with many others in my GitHub repo. Please download and run your own tests. The more data the better! https://github.com/collectivemass/UnityPerformanceTests
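For readers who want the gist without opening the repo, the isolation scheme described above could be sketched roughly like this. This is not the project's actual code: the class and method names, the use of `Stopwatch`, reading `position` inside the loop, and reporting milliseconds are all assumptions made for illustration.

```csharp
using System.Collections;
using UnityEngine;
using Stopwatch = System.Diagnostics.Stopwatch;

// Hypothetical runner: hook RunCachedTest / RunGameObjectTest to UI buttons.
public class IsolatedTestRunner : MonoBehaviour
{
    private const int Iterations = 1000000; // matches the iteration count listed above
    private const int PaddingFrames = 30;   // quiet frames before and after the test

    private Transform cachedTransform;
    private bool isRunning;

    private void Awake()
    {
        cachedTransform = transform;
    }

    public void RunCachedTest()
    {
        if (!isRunning) StartCoroutine(RunIsolated(true));
    }

    public void RunGameObjectTest()
    {
        if (!isRunning) StartCoroutine(RunIsolated(false));
    }

    private IEnumerator RunIsolated(bool useCached)
    {
        isRunning = true;

        // Let UI and event system activity settle before measuring.
        for (int i = 0; i < PaddingFrames; i++) yield return null;

        Vector3 sink = Vector3.zero;
        Stopwatch stopwatch = Stopwatch.StartNew();
        if (useCached)
        {
            for (int i = 0; i < Iterations; i++) sink = cachedTransform.position;
        }
        else
        {
            for (int i = 0; i < Iterations; i++) sink = gameObject.transform.position;
        }
        stopwatch.Stop();

        // Keep the measured frame clear of the string building done below.
        for (int i = 0; i < PaddingFrames; i++) yield return null;

        string label = useCached ? "Cached Transform" : "GameObject.Transform";
        Debug.Log(label + ": " + stopwatch.Elapsed.TotalMilliseconds + " ms (last read: " + sink + ")");
        isRunning = false;
    }
}
```

Building the result string only after the second wait keeps that allocation out of the frame being profiled, which is the point of the padding.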
The Results
MONO
================================
Cached Transform....: 195.678700000
GameObject.Transform: 281.524700000
Cached Transform....: 187.988300000
GameObject.Transform: 279.846200000
Cached Transform....: 193.573000000
GameObject.Transform: 271.087600000
Cached Transform....: 198.028600000
GameObject.Transform: 268.554700000
Cached Transform....: 198.974600000
GameObject.Transform: 273.742700000
Cached Transform....: 200.866700000
GameObject.Transform: 279.724100000
Cached Transform....: 202.941900000
GameObject.Transform: 278.808600000
IL2CPP
================================
Cached Transform....: 130.416900000
GameObject.Transform: 197.128300000
Cached Transform....: 127.395600000
GameObject.Transform: 190.551800000
Cached Transform....: 136.505100000
GameObject.Transform: 184.997600000
Cached Transform....: 137.603800000
GameObject.Transform: 209.228500000
Cached Transform....: 129.791300000
GameObject.Transform: 201.721200000
Cached Transform....: 131.897000000
GameObject.Transform: 203.094500000
Cached Transform....: 137.832600000
GameObject.Transform: 198.318500000
Cached Transform....: 118.972800000
GameObject.Transform: 198.013300000
Cached Transform....: 132.797200000
GameObject.Transform: 189.025900000
Cached Transform....: 129.715000000
GameObject.Transform: 186.721800000
Nothing too unexpected here. Caching is faster, but not by as much as I thought it would be. We are looking at a 1.4029x improvement using Mono and a 1.4870x improvement using IL2CPP. I also tested MonoBehaviour.Transform, and it was on par with GameObject.Transform but actually slightly slower on Android. I also ran this test on my MacBook Pro and had almost identical results.
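As a sanity check on the Mono figure, the ratio can be reproduced by averaging the readings for each method and dividing. That averaging step is an assumption about how the number was derived, and the class and method names below are made up for this sketch.

```csharp
using System;
using System.Linq;

// Hypothetical helper for turning raw readings into a speedup factor.
public static class SpeedupCalc
{
    // Mono readings copied from the results above.
    public static readonly double[] MonoCached =
    {
        195.6787, 187.9883, 193.5730, 198.0286, 198.9746, 200.8667, 202.9419
    };

    public static readonly double[] MonoGameObject =
    {
        281.5247, 279.8462, 271.0876, 268.5547, 273.7427, 279.7241, 278.8086
    };

    // Mean of the slower readings divided by the mean of the faster readings.
    public static double Speedup(double[] faster, double[] slower)
    {
        return slower.Average() / faster.Average();
    }
}
```

Calling `SpeedupCalc.Speedup(SpeedupCalc.MonoCached, SpeedupCalc.MonoGameObject)` returns roughly 1.4029, in line with the Mono improvement quoted above.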
The biggest surprise for me was the lack of memory allocation. All methods were clean, with no pink lines in the timeline. If you are making your own tests, be wary of string construction and the Unity Event System. (I think the new one is better.)
You should cache your transforms, especially in an update loop, but skipping it is not detrimental. Caching will get you some frames, especially if you are moving a lot of stuff, but don't expect more than a 1.5x improvement. No allocations happened, so no garbage was generated. Use IL2CPP; it will get you about a 1.5x improvement on your transform access.
As a bonus, I tried out ARM64 on Android, which has just recently become a feasible option. This gave me a further 1.27x increase in script execution performance. For dramatic effect, if you were using Mono without caching your transforms and you switched to cached transforms, IL2CPP and ARM64, you would get a whopping 2.74x improvement!