C# 8 will bring many new features. Jon Skeet investigates the metadata representation of Nullable Reference Types.
Background: Noda Time and C# 8
C# 8 is nearly here. At least, it’s close enough to being ‘here’ that there are preview builds of Visual Studio 2019 available that support it. Unsurprisingly, I’ve been playing with it quite a bit.
In particular, I’ve been porting the Noda Time source code [Skeet-1] to use the new C# 8 features. The master branch of the repo is currently the code for Noda Time 3.0, which won’t be shipping (as a GA release) until after C# 8 and Visual Studio 2019 have fully shipped, so it’s a safe environment in which to experiment.
While it’s possible that I’ll use other C# 8 features in the future, the two that impact Noda Time most are nullable reference types and switch expressions. Both sets of changes are merged into master now, but the pull requests are still available so you can see just the changes:
- PR 1240: Support nullable reference types [Skeet-2]
- PR 1264: Use switch expressions [Skeet-3]
The switch expressions PR is much simpler than the nullable reference types one. It’s entirely an implementation detail… although admittedly one that confused docfx, requiring a few of those switch expressions to be backed out or moved in a later PR.
Nullable reference types are a much, much bigger deal. They affect the public API, so they need to be treated much more carefully, and the changes end up being spread far and wide throughout the codebase. That’s why the switch expression PR is a single commit, whereas the nullable reference types PR is split into 14 commits – mostly broken up by project.
Reviewing the public API of a nullable reference type change
So I’m now in a situation where I’ve got nullable reference type support in Noda Time. Anyone consuming the 3.0 build (and there’s an alpha available for experimentation purposes [NodaTime]) from C# 8 will benefit from the extra information that can now be expressed about parameters and return values. Great!
But how can I be confident in the changes to the API? My process for making the change in the first place was to enable nullable reference types and see what warnings were created. That’s a great starting point, but it doesn’t necessarily catch everything. In particular, although I started with the main project (the one that creates NodaTime.dll), I found that I needed to make more changes later on, as I modified other projects.
Just because your code compiles without any warnings with nullable reference types enabled doesn’t mean it’s ‘correct’ in terms of the API you want to expose.
For example, consider this method:
public static string Identity(string input) => input;
That’s entirely valid C# 7 code, and doesn’t require any changes to compile, warning-free, in C# 8 with nullable reference types enabled. But it may not be what you actually want to expose. I’d argue that it should look like one of the options in Listing 1.
    // Allowing null input, producing nullable output
    public static string? Identity(string? input) => input;

    // Preventing null input, producing non-nullable output
    public static string Identity(string input)
    {
        // Convenience method for nullity checking.
        Preconditions.CheckNotNull(input, nameof(input));
        return input;
    }
Listing 1
If you were completely diligent when writing tests for the code before C# 8, it should be obvious which is required – because you’d presumably have something like:
    [Test]
    public void Identity_AcceptsNull()
    {
        Assert.IsNull(Identity(null));
    }
That test would have produced a warning in C# 8, and would have suggested that the null-permissive API is the one you wanted. But maybe you forgot to write that test. Maybe the test you would have written was one that would have shown up a need to put that precondition in. It’s entirely possible that you write much more comprehensive tests than I do, but I suspect most of us have some code that isn’t explicitly tested in terms of its null handling.
The important take-away here is that even code that hasn’t changed in appearance can change meaning in C# 8… so you really need to review any public APIs. How do you do that? Well, you could review the entire public API surface you’re exposing, of course. For many libraries that would be the simplest approach to take, as a ‘belt and braces’ attitude to review. For Noda Time that’s less appropriate, as so much of the API only deals in value types. While a full API review would no doubt be useful in itself, I just don’t have the time to do it right now.
Instead, what I want to review is any API element which is impacted by the C# 8 change – even if the code itself hasn’t changed. Fortunately, that’s relatively easy to do.
Enter NullableAttribute
The C# 8 compiler applies a new attribute to every API element which is affected by nullability. As an example of what I mean by this, consider the code in Listing 2, which uses the #nullable directive to control the nullable context of the code.
    public class Test
    {
    #nullable enable
        public void X(string input) {}
        public void Y(string? input) {}
    #nullable restore

    #nullable disable
        public void Z(string input) {}
    #nullable restore
    }
Listing 2
The C# 8 compiler creates an internal NullableAttribute class within the assembly (which I assume it wouldn’t if we were targeting a framework that already includes such an attribute) and applies the attribute anywhere it’s relevant. So the code in Listing 2 compiles to the same IL as this:
    using System.Runtime.CompilerServices;

    public class Test
    {
        public void X([Nullable((byte) 1)] string input) {}
        public void Y([Nullable((byte) 2)] string input) {}
        public void Z(string input) {}
    }
Note how the parameter for Z doesn’t have the attribute at all, because that code is still oblivious to nullable reference types. But both X and Y have the attribute applied to their parameters – just with different arguments to describe the nullability. 1 is used for not-null; 2 is used for nullable.
That makes it relatively easy to write a tool to display every part of a library’s API that relates to nullable reference types – just find all the members that refer to NullableAttribute, and filter down to public and protected members.
It’s slightly annoying that NullableAttribute doesn’t have any properties; code to analyze an assembly needs to find the appropriate CustomAttributeData and examine the constructor arguments. It’s awkward, but not insurmountable.
I’ve started doing exactly that in the Noda Time repository, and got it to the state where it’s fine for Noda Time’s API review. It’s a bit quick and dirty at the moment. It doesn’t show protected members, or setter-only properties, or handle arrays, and there are probably other things I’ve forgotten about. I intend to improve the code over time and probably move it to my Demo Code repository at some point, but I didn’t want to wait until then to write about NullableAttribute.
But hey, I’m all done, right? I’ve explained how NullableAttribute works, so what’s left? Well, it’s not quite as simple as I’ve shown so far.
NullableAttribute in more complex scenarios
It would be oh-so-simple if each parameter or return type could just be nullable or non-nullable. But life gets more complicated than that, with both generics and arrays. Consider a method called GetNames() returning a list of strings. All of these are valid:
    // Return value is non-null, and elements aren't null
    List<string> GetNames()

    // Return value is non-null, but elements may be null
    List<string?> GetNames()

    // Return value may be null, but elements aren't null
    List<string>? GetNames()

    // Return value may be null, and elements may be null
    List<string?>? GetNames()
So how are those represented in IL? Well, NullableAttribute has one constructor accepting a single byte for simple situations, but another one accepting byte[] for more complex ones like this. Of course, List<string> is still relatively simple – it’s just a single top-level generic type with a single type argument. For a more complex example, imagine Dictionary<List<string?>, string[]?>. (A non-nullable reference to a dictionary where each key is a not-null list of nullable strings, and each value is a possibly-null array of non-nullable elements. Ouch.)
The layout of NullableAttribute in these cases can be thought of in terms of a pre-order traversal of a tree representing the type, where generic type arguments and array element types are child nodes in the tree. The above example could be thought of as the tree in Figure 1.
    Dictionary<,> (not null)
        List<> (not null)
            string (nullable)
        [] array (nullable)
            string (not null)
Figure 1
The pre-order traversal of that tree gives us these values:
- Not null (dictionary)
- Not null (list)
- Nullable (string)
- Nullable (array)
- Not null (string)
So a parameter declared with that type would be decorated like this:
[Nullable(new byte[] { 1, 1, 2, 2, 1 })]
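To make the traversal concrete, here is a small sketch that models the type as a tree and flattens it in pre-order. The TypeNode class and the TraversalExample helper are hypothetical, introduced purely for illustration; they are not part of the compiler or of any library.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of an annotated type: a node plus its type arguments
// (or array element type) as children.
public sealed class TypeNode
{
    public string Name { get; }
    public bool IsNullable { get; }
    public IReadOnlyList<TypeNode> Children { get; }

    public TypeNode(string name, bool isNullable, params TypeNode[] children)
    {
        Name = name;
        IsNullable = isNullable;
        Children = children;
    }

    // Pre-order traversal: emit this node's flag first, then each child's
    // flags from left to right. 1 means not-null, 2 means nullable.
    public IEnumerable<byte> Flatten()
    {
        yield return IsNullable ? (byte) 2 : (byte) 1;
        foreach (TypeNode child in Children)
        {
            foreach (byte flag in child.Flatten())
            {
                yield return flag;
            }
        }
    }
}

public static class TraversalExample
{
    // Builds the tree for Dictionary<List<string?>, string[]?> and flattens it.
    public static byte[] DictionaryFlags()
    {
        var type = new TypeNode("Dictionary", false,
            new TypeNode("List", false,
                new TypeNode("string", true)),
            new TypeNode("string[]", true,
                new TypeNode("string", false)));
        return type.Flatten().ToArray();
    }
}
```

Calling TraversalExample.DictionaryFlags() yields the bytes 1, 1, 2, 2, 1, matching the attribute argument shown above.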
But wait, there’s more!
NullableAttribute in simultaneously-complex-and-simple scenarios
The compiler has one more trick up its sleeve. When all the elements in the tree are ‘not null’ or all elements in the tree are ‘nullable’, it simply uses the constructor with the single-byte parameter instead. So Dictionary<List<string>, string[]> would be decorated with [Nullable((byte) 1)] and Dictionary<List<string?>?, string?[]?>? would be decorated with [Nullable((byte) 2)].
(Admittedly, Dictionary<,> doesn’t permit null keys anyway, but that’s an implementation detail.)
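The shortcut can be sketched as a simple check over the flattened flags. This is only an illustration of the rule as described here, not the compiler’s actual implementation; the NullableFlags class is hypothetical, and representing the single-byte form as a one-element array is an assumption made to keep the example small.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class NullableFlags
{
    // If every flag in the pre-order sequence is identical, return just that
    // one flag (standing in for the single-byte constructor); otherwise
    // return a copy of the full sequence (the byte[] constructor).
    public static byte[] Compact(byte[] flags)
    {
        if (flags.Length == 0)
        {
            throw new ArgumentException("At least one flag is required", nameof(flags));
        }
        return flags.All(flag => flag == flags[0])
            ? new[] { flags[0] }
            : flags.ToArray();
    }
}
```

With this helper, the five not-null flags for Dictionary<List<string>, string[]> collapse to a single 1, whereas the mixed sequence 1, 1, 2, 2, 1 from the earlier example is returned unchanged.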
Conclusion
The C# 8 feature of nullable reference types is a really complicated one. I don’t think we’ve seen anything like this since async/await. This article has just touched on one interesting implementation detail. I’m sure there’ll be more on nullability over the next few months…
This article was first published on Jon Skeet’s coding blog on 10 February 2019 at https://codeblog.jonskeet.uk/2019/02/10/nullableattribute-and-c-8/
References
[Microsoft] Background information: https://devblogs.microsoft.com/dotnet/nullable-reference-types-in-csharp/
[NodaTime] Alpha build: https://www.nuget.org/packages/NodaTime/3.0.0-alpha01
[Skeet-1] https://github.com/nodatime/nodatime
[Skeet-2] PR 1240: Support nullable reference types, available at: https://github.com/nodatime/nodatime/pull/1240
[Skeet-3] PR 1264: Use switch expressions, available at: https://github.com/nodatime/nodatime/pull/1264
Jon Skeet is a Staff Software Engineer at Google, working on making Google Cloud Platform rock for C# developers. He’s a big C# nerd, enjoying studying the details of language evolution. He is @jonskeet on Twitter, and his email address is on his Stack Overflow profile.