Update 5th Feb 2019: The tests for this can be found at https://github.com/nawforce/ClassDeploymentTests
When working with multiple managed packages I have often wondered why single file deploy times vary so much. Classes in some packages only take a few seconds to deploy while in other packages classes with a similar purpose can consistently take much longer. The difference can be very large; it often feels like up to 10 times worse.
I have performed some experiments to try and understand what is happening but before we get there I should mention this is more than casual interest for me. If you do a bit of research you will find the link between poor system response times and developer productivity has been studied a few times. While the evidence is not entirely conclusive (for me) it’s pretty strong. How much time developers lose due to long deploy times is always going to be hard to gauge, but if I ask my colleagues at work about this link no one argues it does not exist.
Show me the data!
Ok, fair enough. Let’s start with this:
This is a log of the time taken to deploy individual classes from the same managed package via the Tooling API. Each dot represents one class, with the y-axis showing the average time in milliseconds taken to deploy over 10 attempts. The data is presented along the x-axis in the order it was collected.
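For anyone wanting to collect similar numbers, here is a minimal sketch of the averaging loop in Python. It is not the harness used for the graph: the `deploy` callable is a hypothetical stand-in that would wrap a Tooling API deploy and block until it completes, and the function names are my own.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def average_deploy_ms(deploy: Callable[[str], None], class_name: str,
                      attempts: int = 10) -> float:
    """Return the mean wall-clock time in milliseconds over `attempts` deploys."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # Hypothetical: `deploy` is assumed to block until the deploy finishes.
        deploy(class_name)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


def measure_all(deploy: Callable[[str], None], class_names: Iterable[str],
                attempts: int = 10) -> Dict[str, float]:
    """Measure each class in turn, keeping the order the data was collected in."""
    return {name: average_deploy_ms(deploy, name, attempts) for name in class_names}
```

Keeping the results in collection order matters here, because it is what makes a temporary slowdown visible as a bump on the graph rather than as random noise.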
The first thing you might notice is that around about the 100th class tested there was some kind of brownout. The deploys continued to work but were much slower for a few minutes. I didn’t investigate the cause as it’s not what interested me; I was really interested in the time distribution of the other deploys.
There is a clear indication here of two clusters: classes that deploy in ~3.5 seconds and those that typically take 7 seconds. Repeating this test on a different package shows that 3.5 seconds is virtually always the best case but the location of the higher cluster varies; in the other package I tested the higher cluster was closer to 20 seconds. Another observation is that the percentage of classes in each cluster can vary. In the graph above the two clusters are roughly equal in size but in another case approximately 90% of classes tested were in the slower to deploy group.
This is clearly quite unusual behaviour. I was expecting to find some kind of distribution of deploy times but not quite this. Understanding why we see these clusters requires delving into code patterns, which I will likely do in a future post, but for now we can learn more by studying the deploy time behaviour of some generated classes.
My first thought on seeing the graph above was that the deploy time differences were being caused by the need to invalidate other classes. If you have spent much time developing with Apex you will know classes can become ‘invalid’. To see this go to Setup->Apex Classes and add the IsValid flag to the view. The flag is used to indicate that although the class was valid when deployed (all are) some metadata has changed that might mean it is no longer syntactically valid. The flag is indicating to the runtime that it will need recompiling/rechecking before next use. You can get rid of all the invalid classes by hitting the ‘Compile all classes’ action on that page.
To test this I generated long chains of classes, each calling a method on the next class in the chain so that they became dependent. If you then update the class at the end of the chain you force a large number of classes to become invalid during the update. This test did show part of the deploy time was coming from the need to perform invalidation but even with very long chains (up to 1000 classes) the impact on the deploy times was not large enough to explain what we see with package code.
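A chain like this is easy to generate. The sketch below is one way to do it in Python; the `Chain` prefix and the `value()` method are names made up for illustration, not the ones used in the actual tests.

```python
from typing import Dict


def chain_classes(length: int, prefix: str = "Chain") -> Dict[str, str]:
    """Return {class name: Apex source} where class i calls class i + 1."""
    if length < 1:
        raise ValueError("length must be at least 1")
    sources = {}
    for i in range(length):
        name = f"{prefix}{i}"
        if i + 1 < length:
            # Calling the next class is what creates the dependency.
            body = f"return {prefix}{i + 1}.value() + 1;"
        else:
            # The last class has no dependencies; updating it invalidates the rest.
            body = "return 0;"
        sources[name] = (
            f"public class {name} {{\n"
            f"    public static Integer value() {{ {body} }}\n"
            f"}}\n"
        )
    return sources
```

With `chain_classes(1000)` the class to update for the invalidation test would be `Chain999`, since every other class depends on it directly or indirectly.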
Use the trees
Having failed to identify a cause by looking at invalidation the next step was to look at class dependencies, as logically this is the inverse: if Class A calls a method on Class B we might say updating B invalidates A or, equivalently, A depends on B. To test dependencies I created various sizes of binary tree using classes to get this result.
The simplest 1-layer tree consisted of a root node class that calls methods in two other classes. In a 2-layer tree the root node class calls methods in two other classes but each of those calls methods in two more classes. If you run this test with no other code in each class then you can see an increase in deploy times but it’s not that clear. I found I could create the result above by adding additional code to each class that only depends on platform types.
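The tree shape can be sketched with a small generator. This version numbers the classes heap-style, so class i calls classes 2i + 1 and 2i + 2; the `Node` prefix and `value()` method are illustrative assumptions, and the extra platform-only code is left out.

```python
from typing import Dict


def tree_classes(layers: int, prefix: str = "Node") -> Dict[str, str]:
    """Return {class name: Apex source} for a binary tree of dependent classes.

    A 1-layer tree is a root plus two leaves (3 classes), a 2-layer tree
    has 7 classes, and in general there are 2 ** (layers + 1) - 1 classes.
    """
    if layers < 0:
        raise ValueError("layers must not be negative")
    total = 2 ** (layers + 1) - 1
    sources = {}
    for i in range(total):
        children = [c for c in (2 * i + 1, 2 * i + 2) if c < total]
        if children:
            calls = " + ".join(f"{prefix}{c}.value()" for c in children)
            body = f"return 1 + {calls};"
        else:
            body = "return 1;"  # leaf class, depends on nothing
        sources[f"{prefix}{i}"] = (
            f"public class {prefix}{i} {{\n"
            f"    public static Integer value() {{ {body} }}\n"
            f"}}\n"
        )
    return sources
```

The class to update when timing is the root, `Node0`, because it is the only one that depends on every other class in the tree.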
What this graph is showing is a near linear increase in the time to update the root class of the tree as the tree size grows. There could be a few reasons for this so time to dig a bit deeper with another experiment.
For this test I added ‘weight’ to the classes by using a number of identical small blocks of code. This means that when the number of classes doubles I can halve the number of code blocks to keep the overall amount of Apex code in the tree about constant.
What the data is showing is that for small to medium class numbers the time to update the root class of the tree is proportional to the amount of code in the class and all dependent classes.
With high numbers of classes we can see that deploy times start getting worse but it’s not the linear relationship we saw earlier, so it’s a smaller factor to consider than just the amount of code in all dependent classes.
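The bookkeeping for the weighting is simple enough to sketch. The filler method below is a made-up example of a block that only touches platform types; the blocks actually used in the tests may look quite different.

```python
def weight_block(index: int) -> str:
    """One small filler method that depends only on platform types (assumed shape)."""
    return (
        f"    public static Integer weight{index}() {{\n"
        f"        List<String> items = new List<String>{{'a', 'b', 'c'}};\n"
        f"        return items.size();\n"
        f"    }}\n"
    )


def blocks_per_class(total_blocks: int, class_count: int) -> int:
    """Split a fixed budget of blocks evenly, so more classes means fewer blocks each."""
    return max(1, total_blocks // class_count)


def weighted_class(name: str, block_count: int) -> str:
    """Return Apex source for a class padded with `block_count` filler blocks."""
    body = "".join(weight_block(i) for i in range(block_count))
    return f"public class {name} {{\n{body}}}\n"
```

With a budget of 840 blocks, trees of 3, 7 and 15 classes get 280, 120 and 56 blocks per class, so the total stays at 840 while the class count grows.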
Back to invalidation
With the Salesforce Apex runtime being essentially a black box, understanding why something happens as opposed to what happens can be very difficult if it’s not documented. In this case I am going to speculate a bit and try to explain why we might be seeing this behaviour.
The invalidation result earlier got me thinking about what kind of data structure could be used to invalidate classes very quickly. In this context I think it might be important to understand that class references in an org support some very dynamic behaviours. For example, I can replace a platform class with my own class of the same name (don’t do this, it’s a horrible anti-pattern). If I did do this how are existing class references handled?
My guess (and that’s all this is) is that class references are always being resolved during class loading. This would mean the only way to perform quick invalidation would be to store a list of class names that should be invalidated with each class so that when it is updated there is no need to analyse class dependencies. If that is the case then when you update a class you may need to recompute an invalidation list for all the dependent classes by analysing the code in the dependency tree.
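To make the speculation concrete, here is a toy model in Python of what a precomputed invalidation list could look like. It is purely illustrative and says nothing about how the platform is actually implemented; the function name and the dictionary layout are my own.

```python
from typing import Dict, Set


def invalidation_lists(depends_on: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each class, compute the classes to invalidate when it is updated.

    `depends_on[A]` holds the classes that A references directly. The result
    maps each class to everything that depends on it, directly or indirectly.
    """
    lists: Dict[str, Set[str]] = {name: set() for name in depends_on}
    for deps in depends_on.values():
        for dep in deps:
            lists.setdefault(dep, set())
    for name in depends_on:
        # Walk everything `name` can reach; `name` goes on each of their lists.
        stack = list(depends_on[name])
        seen = {name}
        while stack:
            target = stack.pop()
            if target in seen:
                continue
            seen.add(target)
            lists[target].add(name)
            stack.extend(depends_on.get(target, ()))
    return lists
```

In this model invalidation is a cheap lookup, but keeping the lists current means walking the updated class's whole dependency tree. That cost grows with the amount of code it reaches, which would at least be consistent with the tree results above.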