Corey Lasley
Posted on April 21, 2020
This article is sort of a Part II to my Poor Man’s Profiler article, in which I have taken profiling code to the next step. Initially, this seemed like it would have been a relatively easy task, but I ended up spending many more evenings than I expected tweaking things in order to find success. Often the thought went through my mind: “Why am I doing this, and is anyone actually going to find this useful?” Well I am not one to back down from a coding challenge, and even if I am the only one who gets anything out of this exercise, I wanted to finish what I started. Maybe this will be helpful to someone else out there.
Introduction to this Project
Despite being a developer for two decades, up until more recently, I had simply only known about CIL (Common Intermediate Language) often simply called “IL” and how the .NET compiler works, but never really gave it much thought beyond that, simply because (perhaps surprisingly,) I never really had a good reason to actually dig in and understand it. When I put together my proof of concept code and article on my Poor Man’s Profiler, I started doing some digging into more optimal approaches to profiling code and logging method consumption (i.e. ways to further limit code redundancy and eliminate unnecessary overhead.)
Of course I could go the easy route and use an existing product like PostSharp, but this exercise wasn't simply about finding an existing solution, rather, it was meant to be a learning experience to try to figure out how something like this could be accomplished. In all honesty, being an engineer, I really love a good challenge, and don’t like to just leave all the from-scratch fun programming to others, even if in this case, I may be reinventing the wheel a bit.
So this was when I ventured down a path I hadn't traveled before, discovering metaprogramming and IL weaving. It didn't take long after going down this path, before it clicked that I could disassemble .NET assemblies (via ildasm,) modify the IL code, and reassemble them in a modified state (via ilasm,) it was like I was really seeing for the first time, and the gears in my head really started cranking with ideas.
Granted, I cannot think of a whole lot of uses for directly modifying/writing IL code, that is, unless you are a glutton for punishment. It isn’t the easiest code to understand, as it looks very Assembler’ish. Because it is stack based (push / pop) it gets incredibly lengthy rather quickly, and if you are not careful, you can really dork things up.
I can at least think of a few reasons for needing to directly modify IL, albeit uncommon, such as writing an obfuscator, or create one’s own .NET programming language for academic purposes. The big reason for modifying/writing IL was the no-brainer reason why I was down this road to begin with: code profiling!
Two major aspects of my Poor Man’s Profiler solution didn’t sit well with me from the get-go, but I continued the project and article regardless to demonstrate the evolution of a concept from good to better. What was obviously not so great with my Poor Man’s Profiler:
Having to add additional code (albeit simple) all over an application where one wants method data logged
Having to rely heavily on boxing when logging parameter/variable values.
Through modifying IL code, an entirely different approach can be taken:
In very direct use, a static solution could be written so that no additional code would need to be added to application code, not even a new class library (But in this solution, we are going to take it beyond this.)
With a little additional code added to each class in an assembly, the weaving solution can become very customizable, by allowing the developer to handle logging events.
No more explicit boxing of parameter/variables.
Profiler logging calls are weaved into the methods of IL code post-build and could be used as part of a CICD process.
A Little Background on the Common Intermediate Language
Given the fact that IL code is the “low level” of what all .NET languages (i.e. C#/VB.NET/F#/etc.) compile down to, it’s a good idea to get a little familiar with it. You can take a deep dive into CIL by reading through the ECMA-335 Standards. Wikipedia has a good introduction to the Common Intermediate Language. An excellent book that I think belongs in every serious C# developers library is CLR via C# by Jeffrey Richter which is a great resource, and for a real deep dive, the book .NET IL Assembler by Serge Lidin (a book I do not yet have.)
With a couple resource available, I found that the best way for me to wrap my head around IL, was to simply write some C# code, such as in a console app, build it, and then use ildasm to disassemble it to readable IL. At this point I could look at my .NET code and compare it with its corresponding IL code, and identify patterns of translation. This gave me a solid idea of what I needed to do to generate my own IL code blocks that I could then inject into the code, and finally reassemble via ilasm.
Using a data/text comparison tool, such as WinMerge makes it much easier to see the differences between original code, and what code should look like after injecting new code. This takes quite a bit of guesswork out of things.
Writing my Own IL Parser, Code Generator, and Injector/Weaver Solution
The purpose of my exercise for this project and corresponding article was:
Log execution time of methods.
Log parameter names and values passed into methods, including all primitive types and types that implement IEnumerable.
Limited code needed to be added to a consuming solution, only simple attributes at Class level and/or Method level, depending on how fine tuned one wants things.
Only one method with a specific signature and an Attribute needs to be written in the consuming solution, giving the developer full control over the actual logging process. Method execution time, and parameter data is passed to this method.
Application assemblies are modified post-build to inject all profiling code into applicable methods.
Consumption of this solution should be extremely simple and straight forward.
Having gone through a 101 on IL; On a technical level, the final product will do the following:
Disassemble a DLL or EXE in IL code with ildasm.exe
Parse the IL code into an object structure making IL code analysis and injection possible.
Discover the location of the Logging Handler method (based on a Method level attribute)
Based on specific Class/Method level attributes, determine which methods need IL modification.
If any methods that will be profiled/logged are found to have a parameter which implements IEnumerable, generate a method to convert the value of the enumerable to a comma delimited list string.
Generate IL code to assign parameter names and values (i.e. “[parameter]=[value]; ”) to a single string. For IEnumerable types, call our generated conversion method, and inject at the beginning of the method.
Generate IL code to initiate and start a StopWatch object, and inject into the beginning of the method.
Modify the existing “maxstack” and “.locals init” of the method to accommodate for the StopWatch and Parameter list string objects.
Adjust the indexes on the existing “sdloc”, “ldloc”, and “ldloca” instructions of the method to accommodate the injected code above.
Generate IL code to stop the StopWatch, and inject at the end of the method.
Generate IL code to make a call to the discovered Logging Handler method, passing it the calling method name, the string holding our parameter names and values, and the StopWatch’s elapsed milliseconds, and inject at the end of the method.
Save the modified IL code to disk.
Assemble the modified IL code by calling ilasm.exe (including its resource file that was generated when disassembled if one exists).
Overwrite the original assembly.
Cleanup all temporary working files on disk.
After a couple of weeks, working for an hour or two in the evenings, I was able to successfully pull this off.
The CIL Weaver Solution
I am not going to go into the internals of the my code here because there is a lot to dissect. Rather, I will go into how what I have written can be used to weave profiling code into your assemblies. My CIL Weaver Profiler solution on GitHub contains the following projects:
Core Projects
CILWeaveProfiler (.NET Standard Class Library): This contains all the real beef, containing the code to automate disassembly/assembly, and IL code parsing/writing. There is quite a bit to take in here. Not only does this library contain the logic for modifying assemblies, but it is also the library that you will need to reference in your application that you want to be modified for profiling (more on this later)
CILWeaver.Console (.NET Core 3.1 Console): This application consumes the CILWeaveProfiler class library and performs the calls needed to disassemble an assembly (exe or dll), parse the IL, make modifications to it based on the contents of the existing IL, save the changes, and reassemble the assembly. This could be used manually, or as part of a CICD process. NOTE: If you want to use this, you may need to make some minor modifications.
Development Aids
ReferenceApp (.NET Core 3.1 Console): This is a sandbox application that I use to write test .NET (C#) code so that I could disassemble it, inspect the IL, and figure out what I needed to do within CILWeaveProfiler to accomplish the end goal.
ILStringify.Blazor (.NET Core 3.1 Blazor SPA): A quick tool that I whipped up to help me turn large blocks of IL code into C# string assignment code that I could use while developing CILWeaveProfiler. I figured that I might as well include this. Being so comfortable in the .NET world, I absolutely love how simple Blazor is, and how quickly I can slap a simple SPA together to build quick tools in .NET Core (a topic for another day.)
How the Solution is Applied to Your .NET Project
Prepping Your Code
Through the use of custom Attributes, which is essentially metadata applied to target elements in our .NET code (i.e. Classes and Methods,) we can customize our logging process. Attributes, as most .NET developers are familiar with, are generally used within the application or sometimes the compiler, however we will use these attributes only within the CILWeaver program (well actually the CILWeaveProfile library,) as they are, more or less, simply IL modification directives, used to determine which methods are going to be profiled and how (what sort of IL modifications need to be made.) Additionally, we point the IL modification process to the existing method that will consume the profiled data, so that the developer has the final say into to how the data is actually logged, while the cumbersome task of gathering the profiled data is taken care of and passed to this method.
First, you will want to reference the CILWeaveProfiler Class Library in your project.
In order to tell the CIL Weaver how to modify the application you will use Attributes which require the following reference:
using CILWeaveProfiler.Attributes;
The two Attributes that can be applied are ProfilerClass and ProfilerMethod:
ProfilerClass: This is a class level attribute that can be used to apply how the CILWeaver program will treat all of it’s methods by default. If a method level attribute (see below) is applied to a class’s method, the method level attribute will override this attribute for the particular method.
[ProfilerClass(LoggingType = CILWeaveProfiler.LoggingTypes.All)]
class Program
{
…
}
For LoggingTypes you can specify:
- All: Meaning, by default, unless a Method attribute states otherwise, profile the Execution Time and Parameters passed to the Method.
ExecutionOnly: Meaning, by default, unless a Method attribute states otherwise, only profile the Execution Time of the Method.
ParameterValuesOnly: Meaning, by default, unless a Method attribute states otherwise, only profile the Parameters passed to the Method.
None: Meaning, by default, unless a Method Attribute states otherwise, do not profile the Class’ methods.
If the Class is not marked with the Attribute, it is the same as if it was marked with None. The Class does not need to be marked with the Attribute for Methods to utilize the Attribute.
ProfilerMethod: This is a method level attribute that can be used to tell the CILWeaver program if it should profile the method and how if different from the class level attribute. If this attribute is NOT applied to a method, but the ClassAttribute is applied at class level, the attributes of the class level attribute will be applied to the method.
Little is actually gained by using ExecutionOnly or ParameterValuesOnly, as the modified code will simply pass in an empty string for parameters or a 0 for milliseconds to the logging method. The only gain would be in the small overhead required for instantiating and executing a StopWatch or converting parameter values to strings.
NOTE: Inspecting the code, you may notice that the ClassAttribute and MethodAttribute are identical in their properties, so why didn’t I just use one Attribute for both? This is
because I may want to eventually do more with these attributes that
are applicable more to one or the other, but not necessarily both.
Additionally, and importantly, through the use of another simple attribute, we mark the particular method in our code in which the weaver will inject calls to, so that we do not have to rely on a baked-in logging functionality, but rather, how profiled data is actually logged, is in full control of the developer.
LoggingMethodOverride: This can be applied to a method within the class, or inherited class, and the method can have any name. The method does however require a specific parameter signature. The parameters must be (string methodName, string parameters, long milliseconds) This method will never be profiled itself, or else endless recursion would occur until you reach stack overflow. Note this is not a true override, as consuming classes do not inherit from a special base class, or implement an interface in which a specific logging method needs to be overridden. However, if you do not mark a method with the required signature with this Attribute, absolutely nothing will be logged.
Example
[LoggingMethodOverride]
static void LogIt(string methodName, string parameters, long milliseconds)
{
Console.WriteLine("LOGGING -> " + methodName + "(" + parameters + ") executed in: " + milliseconds);
}
As stated, we can define our logging method with the LoggerAttribute in a base class from which our other classes inherit, so that we have one definition per assembly.
[ProfilerClass(LoggingType = CILWeaveProfiler.LoggingTypes.All)]
class Program
{
[ProfilerMethod(LoggingType = CILWeaveProfiler.LoggingTypes.None)]
static void Main(string[] args)
{
// We do not want to profile this method
DoSomething();
DoSomethingElse(123, new List<DateTime> { DateTime.Now, DateTime.Now.AddMonths(2), DateTime.Now.AddYears(5) });
}
static void DoSomething()
{
// Profiling of this method will inherit the Attribute of the Class since no method level Attribute is specified
}
[ProfilerMethod(LoggingType = CILWeaveProfiler.LoggingTypes.ParameterValuesOnly)]
static void DoSomethingElse(int id, List<DateTime> dates)
{
// For this method, we only want to profile Parameters
}
[LoggingMethodOverride]
static void LogIt(string methodName, string parameters, long milliseconds)
{
// This method is marked with the Attribute of [LoggingMethodOverride], and the CIL will be modified
// in the other applicable methods to gather profile data and pass it to this method
Console.WriteLine("LOGGING -> " + methodName + "(" + parameters + ") executed in: " + milliseconds);
}
}
Executing the IL Weaver on your Assembly
Included in the solution is the CILWeaver.Console program which can be used as the basis for your IL Weaver which utilizes the CILWeaverProfiler library. However with just a handful of lines of code you can quickly integrate the process into your own weaver app. Simply add a reference to the CLIWeaverProfiler, and get rolling.
using CILWeaveProfiler;
This gives you access to the Weaver object which contains simple methods for executing the Disassembly, Modification, and Assembly process.
// Disassemble the assembly...
DisassembleInfo dIL = w.Disassemble("[PATH & FILE NAME]");
string IL = dIL.CIL;
// Parse the disassembled IL Code
Assembly asm = w.ParseILCode(IL);
string newIL = asm.GenerateAssemblyILCode();
// Display the modified IL Code
Console.WriteLine(newIL);
// Assemble the modified IL Code
bool success = w.Assemble(IL, dIL, ReassemblyTypes.OverwriteOriginal);
Console.WriteLine(success ? "SUCCESS!" : "Failed to reassemble");
Making Your Own Modifications to the IL Weaver
Consult a list of CIL instruction such as List of CIL instructions on Wikipedia and get familiar with what you are seeing in your IL code, so that you have a better understanding of how you need to make modifications.
Unless you really feel comfortable with IL code, I must emphasize the usefulness of a data/text comparison tool, such as WinMerge to aid in the process of writing the C# code to generate IL code.
For an IL novice such as myself, the best way to determine what sort of IL code was needed to be generated, and where to put it, I found to be in the following process:
Write a program which contains all the code I would expect to see in my final product.
Build it, and disassemble to IL code.
Save this IL code to a file somewhere.
Remove all the code that should not be in the original assembly (the code that will be injected later,)
Build it, and disassemble to IL code.
Save this IL code to a file.
Finally, load my two versions of IL code into a tool such as the aforementioned WinMerge, and compare them to get a good idea of changes that need to be made.
Sometimes after making IL changes, you may get ilasm to successfully assemble the code. However, after running the executable, you may receive a “System.InvalidProgramException: Common Language Runtime detected an invalid program” This means that syntactically the code was good, but there is an issue with it that ilasm clearly did not detect. With this, and if examining the IL doesn't reveal the issue, you should be able to debug the IL in Visual Studio. A really old article (like 18 years old!) on how this might be able to be done is found here I didn't actually attempt this however.
Another suggestion that I came across was to use the PEVerify.exe utility, however the results of this utility led me on a wild goose chase that wasted a ton of my time, and I was never able to get to the bottom of the issues I had. I discovered that using the utility PEVerify.exe, I consistently got the following 3 errors on a reassembled DLL, whether I modified its original disassembled IL code or do not. During my development process, at one point I was using this utility in my process for extra verification, and I ended up spending a lot of time assuming that something was wrong with my modified IL code in the wrong places, until I ran the utility on an assembly where I had simply disassembled a freshly built DLL to IL and then reassembled it back to a DLL without any modifications.
[MD]: Error: EntryPoint method has invalid number of arguments, expecting 3. [token:0x06000004]
[MD]: Error: EntryPoint method has invalid return type. [token:0x06000004]
[MD]: Error: EntryPoint method has invalid argument number 1. [token:0x06000004]
Despite the error messages above, the EXE consuming the DLL and its internals ran perfectly fine.
Some Quick Tips: Issues I ran into developing this code
When injecting IL code into existing methods, the sdloc, ldloc, and ldloc index values need to be adjusted accordingly, depending on code injection placement! In our case, it makes it somewhat straightforward that our code injections happen both at the beginning and ending of existing applicable methods. If you push and pull the wrong indexes at the wrong time, strange things can happen, and likely a hang or throw an unhandeled exception.
Make sure the maxstack values are adjusted appropriately based on the code you are injecting. If the modified application attempts to push more on to the stack than allocated in maxstack, the application will crash. It is a good idea to set the maxstack at a value that truly represents the max size of the items your method will push to the stack (as opposed to just putting some arbitrary high number there.)
This is what gave me the most grief…. In IL, a string doesn't like to be converted to a string. This seems obvious enough, but C# doesn't mind this, and will allow for code calling .ToString() on a string (the compiler makes adjustments for this in the IL). All other object types I have tested can be converted to a string without issue. In my code which concatenates all of the parameters as single string (thus essentially calling .ToString() on all parameters,) I needed to adjust my code generator to make an exception to this rule if the parameter type is already string. Let me elaborate on this...
As an example, lets say we have a method, with it’s first parameter being a string, and the parameter’s name is “str,” the following code block marked “No Good” was a pattern that works just fine on parameters of other types, but throws an error on string type parameters. Adjusting code to follow a different pattern for a string type parameter solves the problem:
/* No Good for a string */
IL_0011: ldc.i4.1
IL_0012: ldarga.s str
IL_0013: call instance string [System.Runtime]System.Object::ToString()
IL_0014: stelem.ref
IL_0015: dup
/* Good for a string */
IL_0011: ldc.i4.1
IL_0012: ldarg.0
IL_0013: stelem.ref
IL_0014: dup
Summary
To wrap things up here, I hope that what I have provided here within this article as well as the code in the associated GitHub solution can help to provide some insight as to how one might go about modifying a .NET executable post-build to inject additional functionality such as profiling.
Posted on April 21, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.