An area of the Apex runtime that I have often found difficult to grasp is how the Schema namespace works. As part of my work on ApexLink I had to explore this, so I thought I had best write some notes about what I found. Then I thought other Salesforce developers might be interested in some of this, so let’s make it a blog…
A namespace in Apex is similar to a package in Java: it’s a container within which you will typically find the familiar classes, interfaces and enums, which I will collectively refer to as types. There are quite a lot of namespaces in the Apex runtime. ApexLink has definitions for 37, but only a few are commonly used.
Two of the namespaces (System & Schema) are significant in that you do not have to use the namespace name to use classes from them. When the Apex compiler is searching for a type it will automatically search these namespaces. You can qualify types in these namespaces with the namespace name if you want, which can sometimes be useful if a type name is ambiguous, but generally it’s not needed.
If you look at the Schema namespace documentation you will find the types shown in the figure. These are the types that are always available but on any given Org you will find lots of other types here that are created to ease access to database records and other types of metadata.
The simplest of these are the SObject types, such as ‘Schema.Account’, which in most Apex code is just written as ‘Account’ since we like typing fewer characters. Other commonly used types of metadata you can find in Schema include custom settings, custom metadata & platform events.
These additional types are useful because they allow you to refer to types statically by name. So in Apex I can simply write ‘new Account()’ to create a new Account record that I might later insert into the database.
Orgs, Packages & Lazy Loading
In ApexLink I chose to use a lazy loading strategy for the Schema namespace. There were a couple of reasons for this, but you can probably skip this section if you are just interested in learning more about the Schema namespace. Come back if you get lost later on.
ApexLink provides an API that models a simulated Org. By this I mean that to use the API you create an ‘Org’ and inside that you create ‘Packages’, identifying where to load the package metadata from and what other packages they depend on. The Org & Package here are just objects in memory, but they give me a way of thinking about metadata management that is similar to how actual Orgs and Packages work, which I found useful.
A key change I made though was to enforce that Apex code in a package can only reference types within its own package, those exposed from packages it depends on, or the platform provided types. This is a stricter model than is enforced on actual Orgs, but it is useful for package developers because it can help us detect if we are using things we should not. This model does have its complications, however, in that for each Package we need to isolate the types that can appear in namespaces like Schema from what another package may be able to use.
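To make the reference rule a little more concrete, here is a rough Java sketch of it. This is not ApexLink's actual code: ‘SimPackage’, ‘declareType’ and ‘canReference’ are names invented for illustration, and the sketch assumes that only direct dependencies count and that every type in a dependency is exposed, which is a simplification.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a package inside a simulated Org (not ApexLink's real API).
final class SimPackage {
    private final String namespace;
    private final List<SimPackage> dependencies;
    private final Set<String> ownTypes = new HashSet<>();

    SimPackage(String namespace, List<SimPackage> dependencies) {
        this.namespace = namespace;
        this.dependencies = List.copyOf(dependencies);
    }

    String namespace() {
        return namespace;
    }

    // Records a type as being defined by this package.
    void declareType(String typeName) {
        ownTypes.add(typeName);
    }

    // A type can be referenced if it is platform provided, declared in this
    // package, or declared in a package this one directly depends on.
    // Assumption: exposure (e.g. global vs public) is ignored here.
    boolean canReference(String typeName, Set<String> platformTypes) {
        if (platformTypes.contains(typeName) || ownTypes.contains(typeName)) {
            return true;
        }
        for (SimPackage dependency : dependencies) {
            if (dependency.ownTypes.contains(typeName)) {
                return true;
            }
        }
        return false;
    }
}
```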
In ApexLink this isolation is achieved by lazy loading the additional types into the Package that needs them at the point of first reference. This also helps reduce memory and CPU usage during the ApexLink analysis. So as the Apex code that creates an Account is analysed, the definition of an Account is loaded into the Schema namespace for that Package, if it has not already been loaded.
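The load-on-first-reference idea can be sketched in a few lines of Java. Again this is only an illustration under my own assumptions: ‘LazySchemaNamespace’ and its methods are made-up names, a definition is reduced to a plain string, and the ‘loader’ function stands in for whatever actually reads the SObject metadata.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-package view of the Schema namespace (illustrative only).
final class LazySchemaNamespace {
    private final Map<String, String> loaded = new HashMap<>();
    private final Function<String, String> loader;
    private int loadCount = 0;

    // 'loader' stands in for the code that reads metadata for a type name.
    LazySchemaNamespace(Function<String, String> loader) {
        this.loader = loader;
    }

    // Returns the definition for typeName, loading it on first reference only.
    String findType(String typeName) {
        String definition = loaded.get(typeName);
        if (definition == null) {
            definition = loader.apply(typeName);
            loadCount++;
            loaded.put(typeName, definition);
        }
        return definition;
    }

    // Number of times the loader has actually been invoked.
    int loadCount() {
        return loadCount;
    }

    boolean isLoaded(String typeName) {
        return loaded.containsKey(typeName);
    }
}
```

Because each package would hold its own instance, a type loaded for one package does not become visible to another as a side effect.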
In addition to providing types to assist in writing Apex the other feature the Schema namespace provides is to describe the shape of metadata. In Java we would call this feature reflection but Apex only provides describe support for a pretty limited set of types and the supported capabilities are much reduced. The core of this feature is Schema.SObjectType which can be obtained in a few ways.
The third case here is interesting as we appear to be accessing SObjectType twice. ‘SObjectType.Account’ is returning a ‘DescribeSObjectResult’ from which we can then get the actual SObjectType. I added the fourth case just for fun. You can do this, but I have no idea why it works.
Before going further I should also mention that the performance of these may vary significantly on real Orgs. Traditionally accessing ‘describe’ data has been expensive since it needs to be pulled into a cache on the server that runs the Apex code. I have not benchmarked these, but I would expect that the third one will not be cheap the first time any code uses ‘describe’ data for Account. If you want to know more about describe performance you must read this blog by Chris Peterson.
While looking at SObjectType I noticed an oddity in how it behaves that is shown by this code.
I think most Apex programmers know there is some weirdness in this area but often don’t grasp it because there is no detailed documentation on how Apex works. In this case though you can smell there is something amiss by observing that Account.SObjectType clearly can’t return the same type as, say, Contact.SObjectType because they have different fields.
What I thought was likely happening here is that ‘Account.SObjectType’ does not return an SObjectType but something derived from an SObjectType, let’s think of it as an ‘AccountSObjectType’, and when I assign that to the generic SObjectType the ‘Account’ specific parts are no longer accessible.
In ApexLink, rather than create a distinct SObjectType for each SObject, I re-used the generics support needed for List, Set, Map etc. and instead of Account.SObjectType used SObjectType<Account>. To cover my magic I then hide this generic type whenever I need to print it by displaying it as ‘Account.SObjectType’, i.e. the thing that generates it rather than what it is.
Hiding my tracks here feels pretty dirty but is necessary to avoid introducing a new abstraction that is visible to programmers. Why Salesforce hid this will become a bit clearer later on, but as we are about to see these kinds of decisions tend to cascade on you in runtime designs.
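As a rough illustration of that trick, here is a small Java sketch of a type name that has one form for analysis and another for display. The ‘TypeName’ class and its methods are hypothetical and are not taken from ApexLink; the only thing borrowed from the description above is the mapping from SObjectType<Account> to ‘Account.SObjectType’.

```java
import java.util.List;

// Hypothetical type-name model: a base name plus optional type arguments.
final class TypeName {
    private final String name;
    private final List<TypeName> typeArguments;

    TypeName(String name, List<TypeName> typeArguments) {
        this.name = name;
        this.typeArguments = List.copyOf(typeArguments);
    }

    static TypeName simple(String name) {
        return new TypeName(name, List.of());
    }

    // Internal form used during analysis, e.g. SObjectType<Account>.
    String internalName() {
        return render(false);
    }

    // Form shown to programmers: the hidden generic is printed as the
    // expression that produces it, e.g. Account.SObjectType.
    String displayName() {
        if (name.equals("SObjectType") && typeArguments.size() == 1) {
            return typeArguments.get(0).displayName() + ".SObjectType";
        }
        return render(true);
    }

    // Renders name<arg, arg>, using the display form of arguments if asked.
    private String render(boolean forDisplay) {
        if (typeArguments.isEmpty()) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name).append('<');
        for (int i = 0; i < typeArguments.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            TypeName argument = typeArguments.get(i);
            sb.append(forDisplay ? argument.displayName() : argument.internalName());
        }
        return sb.append('>').toString();
    }
}
```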
In the last section we saw accessing fields directly on an SObjectType to get an SObjectField but you can also access them via the ‘fields’ field. Before looking at that I want to point to a difference here as there is another ‘fields’ field:
I am interested in the second one here, available on SObjectType. I will come back to DescribeFieldResult later on. I have yet to find a description of what ‘fields’ is in this case. If you try to evaluate it without adding a field name, a null is returned, which is not a great help.
From an ApexLink perspective I again turned to generics to handle this, so ‘fields’ is typed as a SObjectFields<Account> type. For those wondering, although I use generics to carry the SObject type here, there is another trick being used to allow the fields accessible on a type to vary independently of that type. At the core of the analysis in ApexLink you don’t iterate over a fixed set of fields available on a type but call a function ‘findField(name)’ on the type instance. This function is free to return fields from some pre-existing set or may construct fields dynamically as needed, which allows the visible fields on SObjectFields<Account> to be different to those on SObjectFields<Opportunity>. In this case I use the type argument, like ‘Account’, to work out what fields should be findable.
It’s still a bit of a mystery to me why we can access the fields both directly on SObjectType instances and via ‘fields’. You would have thought one would have been fine. Currently I use the same code to implement findField() for either case, but maybe someone can point to a difference between them.
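The findField() approach described above can be sketched in Java roughly as follows. Only the ‘findField(name)’ name comes from the text; ‘SObjectFieldsType’, the string used to represent a field and the metadata map are assumptions made for the example, not how ApexLink stores things.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: the findable fields depend on the type argument.
final class SObjectFieldsType {
    // The type argument, e.g. "Account" in SObjectFields<Account>.
    private final String sobjectName;
    // Stand-in for loaded metadata: SObject name -> its field names.
    private final Map<String, Set<String>> metadata;

    SObjectFieldsType(String sobjectName, Map<String, Set<String>> metadata) {
        this.sobjectName = sobjectName;
        this.metadata = metadata;
    }

    // Fields are constructed on demand rather than read from a fixed list
    // attached to the type. Apex identifiers are case insensitive, so the
    // lookup ignores case.
    Optional<String> findField(String fieldName) {
        Set<String> fields = metadata.getOrDefault(sobjectName, Set.of());
        for (String candidate : fields) {
            if (candidate.equalsIgnoreCase(fieldName)) {
                return Optional.of(sobjectName + "." + candidate + " : SObjectField");
            }
        }
        return Optional.empty();
    }
}
```

The same class serves SObjectFields<Account> and SObjectFields<Opportunity>. Only the type argument passed in changes what can be found.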
Schema.SObjectType Statics & Describes
Let’s go back a bit and look at the Schema.SObjectType static fields. If you are anything like me, at this point your head is starting to spin a bit, so let’s quickly recap. The previous discussion has been focusing on instances of SObjectType, but that is just part of what is hidden here. In this section we are focusing on the static fields that you can also find on SObjectType.
On SObjectType you can access the ‘DescribeSObjectResult’ and from that the ‘DescribeFieldResult’. We can again demonstrate the problem if you split these.
In this case though there is some documentation that debunks my theory that this was being caused by ‘SObjectType.Account’ not returning a DescribeSObjectResult but something derived from it.
It’s clear from this that the Apex parser has been made able to understand the objects and fields available at runtime but only in very specific contexts. This is something of an anti-pattern in language design exactly because of the confusion it creates between the syntactic and semantic domains that programmers use to help understand errors. In short I don’t understand why the example breaks because my programmer brain is wired to see this as not possible if syntax is cleanly separated from semantics.
From an ApexLink perspective though this is very similar to what we have seen before and can be handled by treating ‘SObjectType.Account’ as returning a hidden generic type that inherits from DescribeSObjectResult so it is assignable to it in a way that will mimic this behaviour. There is almost matching behaviour for FieldSet access which is dealt with in the same way.
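A minimal Java sketch of that idea, under my own naming assumptions (‘DescribeResultType’, ‘HiddenDescribeResultType’ and ‘findDescribeField’ are invented for illustration and are not ApexLink classes): the hidden type can be assigned to the plain describe type, but field lookups only succeed while the expression still has the hidden type.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical analysis-time type for a plain DescribeSObjectResult.
class DescribeResultType {
    // Resolves 'fields.<name>' on a value of this static type. The plain
    // type does not know which SObject it describes, so nothing is found.
    Optional<String> findDescribeField(String fieldName) {
        return Optional.empty();
    }

    // Assignment is allowed when the source is this type or a subtype of it.
    boolean isAssignableFrom(DescribeResultType source) {
        return getClass().isAssignableFrom(source.getClass());
    }
}

// Hidden type for an expression such as SObjectType.Account. It inherits
// from the plain describe type so that it can be assigned to it.
final class HiddenDescribeResultType extends DescribeResultType {
    private final String sobjectName;
    private final Set<String> fieldNames;

    HiddenDescribeResultType(String sobjectName, Set<String> fieldNames) {
        this.sobjectName = sobjectName;
        this.fieldNames = Set.copyOf(fieldNames);
    }

    // Simplification: exact-case match only.
    @Override
    Optional<String> findDescribeField(String fieldName) {
        if (fieldNames.contains(fieldName)) {
            return Optional.of(sobjectName + "." + fieldName + " : DescribeFieldResult");
        }
        return Optional.empty();
    }
}
```

Once the value has been assigned to a variable of the plain type, the analysis only sees ‘DescribeResultType’, so the field lookup fails in the same way the split example does.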
Using generics and findField() handling has got me past most problems in the Schema namespace in ApexLink, but there were a number of other challenges so I will mention them briefly here in summary.
- Alongside the main type for some metadata we also have to create the companion objects such as Share, History & Feed types. Mostly these are straightforward, but RowCause in the Share object required some special handling for its SObjectField.
- When dealing with Lookup and Master/Detail fields the relationship fields need to be created. This has forced me to load all SObject related metadata on startup so the correct related lists can be created. I still have some difficulty here with Rollups which can also create a dependency order during metadata loading which I need to resolve.
- Separating out the various types of metadata that can cause things to appear in the Schema namespace has been rather difficult, not least because in SFDX you can omit the ‘object-meta’ files when adding fields to an existing type, which makes detection of which directories contain metadata more complex. [Ed: You should think about extracting your metadata identification and parsing code into a separate library as others might find that useful.]
- There is complexity in ApexLink around determining whether to create a new type or extend an existing one when, say, we are adding a custom field. This is fairly inherent in the problem domain, but the relationship between ‘Activity’ and ‘Task’ & ‘Event’ needed some special case handling.
- The idea of there being standard fields on each metadata type has caused quite a lot of grief. In many cases it turns out these are not as standard as you might think they should be; they are often mere conventions (with exceptions).