October 01, 2002
Generating C# or VB.NET Code
Paul Kimmel
Paul shows how to write a code generator in VB.NET that is capable of generating a strongly typed collection of any type in C# or VB.NET.
The CodeDOM is one of the interesting features of .NET. It comprises the classes in the System.CodeDom and System.CodeDom.Compiler namespaces. These classes let you write code generators that produce C# or Visual Basic .NET code and either write that code out to a module or compile it to an assembly. Generating code is cool. Technoids love it. The ability to generate code means that well-known Patterns and Refactorings can be generated flawlessly for either C# or VB.NET. I'll show you how to write a code generator in VB.NET that is capable of generating a strongly typed collection of any type in C# or VB.NET.
A strongly typed collection is a collection that contains a known type. In .NET, these typed collections are derived from the CollectionBase or ReadOnlyCollectionBase classes. Strongly typed collections contain an indexer, and as a result can be employed like an array; typed collections also implement one or more of IList, IListSource, IEnumerable, and ISerializable. As a result, you can bind a typed collection to a control such as a DataGrid, and you can serialize a typed collection, which lets you return instances of your typed collections from web services. These are all very powerful capabilities. An example of a strongly typed collection of Player objects is shown in Listing 1.
Note that I created an emitter that generated MSIL representing a strongly typed collection before I created the code that generated the VB.NET shown in Listing 1. After I created the MSIL emitter, I created the sample code you see in Listing 2. Subsequent to that, I discovered the ListBuilder.cs sample code that was provided by Microsoft. The ListBuilder.cs sample code also creates a strongly typed collection.
The collection is very simple. It contains a default indexer, which means we can treat instances of PlayerList just like an array, and the Add method supports a dynamic increase in the size of the collection, something you don't get from an array.
PlayerList inherits from CollectionBase. CollectionBase implements IList, ICollection, and IEnumerable. As a result, you can index PlayerList objects like arrays or use them in a For Each statement. Listing 4 demonstrates a PlayerList being used as a DataSource for a Windows Forms DataGrid.
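As a rough illustration of the shape being described (this is a sketch, not the author's Listing 1), a hand-written typed collection might look like the following. The Player class and its Name property are hypothetical stand-ins for whatever type you want to collect.

```vbnet
Imports System
Imports System.Collections

' Hypothetical element type; the article's own Player class may differ.
Public Class Player
    Private _name As String

    Public Sub New(ByVal name As String)
        _name = name
    End Sub

    Public ReadOnly Property Name() As String
        Get
            Return _name
        End Get
    End Property
End Class

' A strongly typed collection of Player objects.
Public Class PlayerList
    Inherits CollectionBase

    ' Default indexer: lets callers write players(0), as with an array.
    Default Public Property Item(ByVal index As Integer) As Player
        Get
            Return CType(InnerList(index), Player)
        End Get
        Set(ByVal value As Player)
            InnerList(index) = value
        End Set
    End Property

    ' Add grows the collection on demand and returns the new element's index.
    Public Function Add(ByVal value As Player) As Integer
        Return InnerList.Add(value)
    End Function
End Class

Module PlayerListDemo
    Sub Main()
        Dim players As New PlayerList()
        players.Add(New Player("Ada"))
        players.Add(New Player("Grace"))

        ' Indexed access, then enumeration through IEnumerable.
        Console.WriteLine(players(0).Name)
        For Each p As Player In players
            Console.WriteLine(p.Name)
        Next
    End Sub
End Module
```

Because Item and Add accept only Player, a caller cannot put an unrelated object into the list through these members without a compile-time error, which is the main point of typing the collection.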
Generating Code
The most convenient way to think about the CodeDOM is that it contains classes, and those classes have members that can perform the same operations you would perform interacting with Visual Studio .NET and your keyboard. That is, the CodeDOM can create modules; add references, namespaces, and types; add members to those types; and then either save the generated code to a source code file in VB.NET or C# or compile the code to an assembly. This is precisely what you do as a VB.NET programmer. The key, then, is finding out the names of the classes and members that perform these tasks. Learning about CodeDOM classes will not take that long. The hard part is writing code that generates bug-free code. Thankfully, the CodeDOM manages the syntax; we just have to get the algorithms correct.
The complete listing for the strongly typed code generator is provided in Listing 2. The line numbers are provided for reference. If you are new to VB.NET, remember not to include the line numbers when you are experimenting with the code. Additionally, every programmer has a slightly different style. Following the rule of divide et impera, or divide and conquer, I prefer to break classes into well-named, singular methods that each perform the task named by the method. Well-named, singular methods save time on commenting to some extent and make individual methods very easy to debug.
The methods in Listing 2 are a bit longer than I prefer and could stand a bit of Refactoring. For example, you will notice that there are many in-line instances of object creation. This is perfectly acceptable in VB.NET, but we could probably Refactor some of this code into named properties and reduce the number of times the objects need to be created or reduce the amount of in-line object creation.
Generating VB.NET or C# Code
Code generators are essentially subprograms that write code. The real trick is to write a code generator that writes bug-free code. One of the most reliable ways to do this is to write generators based on well-known Patterns (although you can write any kind of generator you want to). The benefit is that once your generator is correct, every bit of code it generates will be flawless every time. As I mentioned earlier, and as Listing 2 demonstrates, code generators need to do the same things a programmer does when creating a new subprogram. Specifically, a generator needs to start by creating a unit. We'll start by exploring the code that begins the code generation, including generating the unit.
Generating Units
You can generate VB.NET or C# code with the same generator. The ListGenerator class demonstrates how this works. To generate a strongly typed collection with the ListGenerator, invoke one of the two shared methods: GetVBCode on line 56 or GetCSharpCode on line 68 in Listing 2. Shared methods that create objects are referred to as factory methods. Listing 3 demonstrates how to invoke the VB.NET code and load the code into a RichTextBox. The first statement in the Try block invokes the ListGenerator.GetVBCode method. (It just happens that the class and the namespace are identical; they are both ListGenerator. I used the fully qualified name: namespace, class, and shared method.) GetVBCode writes the code to a .vb file named after the collected type. For example, if the name of the type is Player, then List.vb is appended to the type name to create the file name. (This is a convention I employed.)
The second statement loads the generated code into the RichTextBox control, assuming an exception didn't push us into the Catch block. Remember to use the RichTextBoxStreamType.PlainText when loading the generated VB code into the RichTextBox. By default, the RichTextBox tries to load text as rich text (.RTF text), but VB code is not rich text.
Both GetVBCode and GetCSharpCode operate the same way, with one distinct difference: Each creates a different version of the CodeDomProvider class. GetVBCode creates a VBCodeProvider (refer to line 61 of Listing 2) and GetCSharpCode creates an instance of the CSharpCodeProvider (refer to line 73). After that point, the code is identical whether you want to generate VB or C# code.
GetCode requires the type of the object we will be collecting in the list, the specific CodeDomProvider, and a suffix used to generate the output source file name. If we are outputting VB code, the suffix will be "List.vb"; for C# code it will be "List.cs".
Both methods branch to the shared method GetCode. GetCode creates an instance of the ListGenerator class (recall that all three methods, so far, are shared methods) and invokes the ListGenerator.WriteCode method. The provider we created in GetVBCode or GetCSharpCode supplies the object that implements ICodeGenerator. All of the code that is generated begins in the instance method WriteCode.
There are two versions of WriteCode. The second version on lines 96 through 114 of Listing 2 initiates the code generation sequence, after successful preparation. A StreamWriter is created on lines 99 and 100. The StreamWriter is used to write data to a stream; in this case the generated code will be written to the stream. The target stream is a FileStream; that is, the generated code will be written to a file. A CodeCompileUnit is created on line 102. The CodeCompileUnit class is the only object that can be compiled using the CodeDOM. (You could, however, generate the code and use a command-line compiler, but we want this code compilable by the generator.)
The rest of the code is created by following the named methods I implemented in sequence in the WriteCode method. The provider determines the code-generated language. The ICodeGenerator is requested from the provider. The rest of the generator is a matter of stringing together elements found in any .NET program, including classes, methods, fields, properties, events, and other elements. If you understand these basic concepts, then you can follow the code pretty easily. Let's take a look at some of the other salient pieces of the code generator.
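The provider-selects-the-language idea can be sketched in a few lines. This is not Listing 2: the module and function names are hypothetical, and it calls GenerateCodeFromCompileUnit directly on the CodeDomProvider, which later framework versions offer in place of requesting an ICodeGenerator from the provider. It writes to a StringWriter instead of a file so that it runs without touching the disk, and on newer .NET runtimes it assumes the System.CodeDom package is referenced.

```vbnet
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.IO
Imports Microsoft.CSharp
Imports Microsoft.VisualBasic

Module ProviderSketch
    ' Builds a trivial code tree: one namespace holding one empty class.
    ' "Generated" is a hypothetical namespace name.
    Function BuildUnit(ByVal typeName As String) As CodeCompileUnit
        Dim unit As New CodeCompileUnit()
        Dim ns As New CodeNamespace("Generated")
        ns.Types.Add(New CodeTypeDeclaration(typeName & "List"))
        unit.Namespaces.Add(ns)
        Return unit
    End Function

    ' Renders the same tree with whichever provider the caller supplies.
    Function Render(ByVal unit As CodeCompileUnit, ByVal provider As CodeDomProvider) As String
        Using writer As New StringWriter()
            provider.GenerateCodeFromCompileUnit(unit, writer, New CodeGeneratorOptions())
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Dim unit As CodeCompileUnit = BuildUnit("Player")
        ' Only the provider differs between the two calls.
        Console.WriteLine(Render(unit, New VBCodeProvider()))
        Console.WriteLine(Render(unit, New CSharpCodeProvider()))
    End Sub
End Module
```

To write to a file as the article does, you could pass a StreamWriter opened on the target file instead of the StringWriter.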
Generating Comments
The WriteComments method I implemented adds the comment on line 1 of Listing 1. Lines 117 through 122 return a formatted string, and the ICodeGenerator and a CodeCommentStatement object created on line 127 of Listing 2 are used to write the comment to the stream. CodeCommentStatement is constructed with the text of the comment we want to write, and ICodeGenerator.GenerateCodeFromStatement is used to insert the comment text into the stream. The CodeCommentStatement and the TextWriter (the stream) are passed as arguments to GenerateCodeFromStatement on line 128 of Listing 2.
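A minimal sketch of the comment step, under the same assumptions as before: the names are hypothetical, and GenerateCodeFromStatement is called on the provider itself instead of on an ICodeGenerator obtained from it.

```vbnet
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.IO
Imports Microsoft.VisualBasic

Module CommentSketch
    ' Wraps the text in a CodeCommentStatement and lets the provider
    ' choose the comment syntax for its language.
    Function RenderComment(ByVal text As String, ByVal provider As CodeDomProvider) As String
        Dim comment As New CodeCommentStatement(text)
        Using writer As New StringWriter()
            provider.GenerateCodeFromStatement(comment, writer, New CodeGeneratorOptions())
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Dim text As String = "Generated on " & DateTime.Now.ToString()
        Console.Write(RenderComment(text, New VBCodeProvider()))
    End Sub
End Module
```

Passing a CSharpCodeProvider instead would produce the same text as a C# comment, with no change to the statement object.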
Generating Namespaces
The WriteNamespaces method creates a CodeNamespace object. This object is used to insert Imports statements into the CodeCompileUnit. Line 135 creates the CodeNamespace object. Lines 136 and 137 add the Imports statements for the System and System.Collections namespaces, which are needed to refer to the CollectionBase class later. The CodeNamespace object is added to the CodeCompileUnit on line 139 of Listing 2. There are just a couple of extra pieces of information that I want to point out about the WriteNamespaces method. The first is that the ICodeGenerator is superfluous in this method. ICodeGenerator is central to the CodeDOM, and I just make a habit of passing it around in case I need it. Second, note that the Imports statement in VB.NET is equivalent to a using statement in C#. Lines 136 and 137 will generate an Imports statement for VB and a using statement for C# code.
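The namespace step might be sketched as follows. The namespace name "Generated" and the placeholder class are assumptions made for illustration; the two imports are the ones the text names.

```vbnet
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.IO
Imports Microsoft.CSharp
Imports Microsoft.VisualBasic

Module NamespaceSketch
    Function BuildUnit() As CodeCompileUnit
        Dim unit As New CodeCompileUnit()
        Dim ns As New CodeNamespace("Generated")   ' hypothetical namespace name

        ' Each CodeNamespaceImport becomes Imports (VB) or using (C#).
        ns.Imports.Add(New CodeNamespaceImport("System"))
        ns.Imports.Add(New CodeNamespaceImport("System.Collections"))

        ' Placeholder type so the namespace is not empty.
        ns.Types.Add(New CodeTypeDeclaration("Placeholder"))

        unit.Namespaces.Add(ns)
        Return unit
    End Function

    Function Render(ByVal unit As CodeCompileUnit, ByVal provider As CodeDomProvider) As String
        Using writer As New StringWriter()
            provider.GenerateCodeFromCompileUnit(unit, writer, New CodeGeneratorOptions())
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Dim unit As CodeCompileUnit = BuildUnit()
        Console.WriteLine(Render(unit, New VBCodeProvider()))
        Console.WriteLine(Render(unit, New CSharpCodeProvider()))
    End Sub
End Module
```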
Generating Classes
The WriteClass method is responsible for generating the class. Classes are elements of namespaces, which may be an unfamiliar concept to VB programmers. Simply think of namespaces as a higher level of aggregation than the class. Because classes are elements of namespaces, we add the object that represents the class to the namespace, which occurs on lines 149 through 157. Line 151 creates a CodeTypeDeclaration object initialized with the name I will be using for my new type. I set the IsClass property to True. For creating a struct, we could set IsStruct to True. Line 153 adds the new type to the namespace's collection of defined types. Line 154 is used to express an inheritance relationship. CodeTypeDeclaration.BaseTypes is a collection indicating the classes that our class will be inheriting from. The easiest way to create a strongly typed collection is to inherit from System.Collections.CollectionBase, as on line 154 of Listing 2. Line 155 adds the Public access modifier to our new class.
There are a couple more important points. The first is that many of the objects I am passing to methods are actually members of the ListGenerator class. Since they are members, I do not have to pass them as arguments to the methods; stylistically, it would be better if I didn't. I passed arguments like [NameSpace] to hopefully make it clear what was needed in a particular method. Second, notice that the argument [NameSpace] is enclosed in brackets. NameSpace, in any capitalization, is a reserved word in VB.NET. You can use reserved words in nonreserved contexts in VB.NET by enclosing them in [], and you have to for at least one reserved word, Assembly. Assembly is both a meta instruction for assembly-level attributes and the name of a class. To invoke a class method on the Assembly class you will need to enclose Assembly in []. You will see that this was done in Listing 4.
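The class-declaration steps described above can be sketched like this. The function names are hypothetical, and the sketch renders only the type (via GenerateCodeFromType) so the output stays short.

```vbnet
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.Collections
Imports System.IO
Imports System.Reflection
Imports Microsoft.VisualBasic

Module ClassSketch
    ' Declares a public class named <elementTypeName>List that
    ' inherits from System.Collections.CollectionBase.
    Function BuildListClass(ByVal elementTypeName As String) As CodeTypeDeclaration
        Dim decl As New CodeTypeDeclaration(elementTypeName & "List")
        decl.IsClass = True
        decl.TypeAttributes = TypeAttributes.Public
        decl.BaseTypes.Add(GetType(CollectionBase))
        Return decl
    End Function

    Function Render(ByVal decl As CodeTypeDeclaration, ByVal provider As CodeDomProvider) As String
        Using writer As New StringWriter()
            provider.GenerateCodeFromType(decl, writer, New CodeGeneratorOptions())
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Console.WriteLine(Render(BuildListClass("Player"), New VBCodeProvider()))
    End Sub
End Module
```

In a full generator the declaration would also be added to the namespace's Types collection, as the text describes, so that it becomes part of the compile unit.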
Generating an Indexer
The WriteIndexer method is 28 lines long, or about 25 lines longer than I prefer. However, indexers need very specific things, and I elected to keep them all in one place rather than creating a lot of temporary variables or factoring out in-line code, which might have made the listing harder to follow. An indexer property is the property that permits you to treat the typed collection like an array. It is a property, however, so you need a getter to read an indexed value and a setter to write an indexed value. WriteIndexer is subdivided into two parts. Lines 169 through 175 write the getter method, and lines 177 through 184 write the setter method. These are actually compiled as get_propertyname and set_propertyname in MSIL. For example, an indexer property will be encoded as a simple property named "Item" with two associated methods, get_Item and set_Item, generated by the compiler. When we are writing the generator we need to generate these two methods ourselves. (Figure 1 shows the MSIL for the PlayerList class from Listing 1 with the get_Item method selected.)
The getter method returns the object referenced by the index argument. From Listing 2, lines 169 through 175, the code to generate the getter looks pretty complex. However, if you understand the code, it isn't too bad. Basically, the indexer will get an index argument, which will be used to index an InnerList property inherited from CollectionBase. The return result of that is typecast to the type we are returning from the strongly typed collection. The code on lines 169 through 175 generates an inline return statement shown in Listing 1.
The setter has two arguments. The first is the index of the element to set and the second is the object that we want to insert into the collection.
Line 171 demonstrates how to perform the typecast on the return type in the getter using the CodeCastExpression object, and lines 174 and 181 demonstrate how to refer to the object's reference to itself, known as Me in VB.NET and this in C#. Finally, the CodeMemberProperty object, which represents a property in the CodeDOM, is added to the CodeTypeDeclaration object's Members collection for the obvious reason that classes have members.
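One way the indexer might be assembled is sketched below; it is not the WriteIndexer method from Listing 2, and the names are hypothetical. It relies on the VB and C# generators treating a parameterized property named Item as the default property or indexer. Note that in the code tree the incoming value is not declared as a parameter; the setter refers to it through CodePropertySetValueReferenceExpression.

```vbnet
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.IO
Imports Microsoft.CSharp
Imports Microsoft.VisualBasic

Module IndexerSketch
    Function BuildIndexer(ByVal elementType As String) As CodeMemberProperty
        Dim prop As New CodeMemberProperty()
        prop.Name = "Item"
        prop.Type = New CodeTypeReference(elementType)
        prop.Attributes = MemberAttributes.Public Or MemberAttributes.Final
        prop.Parameters.Add(New CodeParameterDeclarationExpression(GetType(Integer), "index"))

        ' Me.InnerList(index) in VB, this.InnerList[index] in C#.
        Dim element As New CodeIndexerExpression( _
            New CodePropertyReferenceExpression(New CodeThisReferenceExpression(), "InnerList"), _
            New CodeArgumentReferenceExpression("index"))

        ' Getter: return the element cast to the collected type.
        prop.GetStatements.Add( _
            New CodeMethodReturnStatement(New CodeCastExpression(elementType, element)))

        ' Setter: assign the implicit value to the indexed slot.
        prop.SetStatements.Add( _
            New CodeAssignStatement(element, New CodePropertySetValueReferenceExpression()))
        Return prop
    End Function

    ' Hosts the property in a class so it can be rendered on its own.
    Function Render(ByVal prop As CodeMemberProperty, ByVal provider As CodeDomProvider) As String
        Dim decl As New CodeTypeDeclaration("PlayerList")
        decl.Members.Add(prop)
        Using writer As New StringWriter()
            provider.GenerateCodeFromType(decl, writer, New CodeGeneratorOptions())
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Console.WriteLine(Render(BuildIndexer("Player"), New VBCodeProvider()))
        Console.WriteLine(Render(BuildIndexer("Player"), New CSharpCodeProvider()))
    End Sub
End Module
```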
Generating Methods
Finally, the Add method is generated. The Add method lets the collection grow beyond its current number of elements. Inherited from CollectionBase, InnerList is an ArrayList and supports dynamic growth through the InnerList.Add method. You can obtain run-time type information for types in the System namespace with GetType(type). For example, GetType(Integer) will return the Type object for the Integer type. However, you will need to refer to the assembly name to use this technique for most types because the framework has to be able to load the assembly to discover the type information.
The CodeMemberMethod represents methods in the CodeDOM. As is true with all methods, there is a return type, a method name, optional arguments, access modifiers, optionally attributes, and code. WriteAddMethod demonstrates how to create most of these elements using classes in the CodeDOM.
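A sketch of how such an Add method could be built from those elements follows. It is not the article's WriteAddMethod; the names are hypothetical, and the element type is passed as a string so the sketch does not need the collected type's assembly to be loaded.

```vbnet
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.IO
Imports Microsoft.VisualBasic

Module AddMethodSketch
    Function BuildAddMethod(ByVal elementType As String) As CodeMemberMethod
        Dim method As New CodeMemberMethod()
        method.Name = "Add"
        method.ReturnType = New CodeTypeReference(GetType(Integer))
        method.Attributes = MemberAttributes.Public Or MemberAttributes.Final
        method.Parameters.Add(New CodeParameterDeclarationExpression(elementType, "value"))

        ' Return Me.InnerList.Add(value)
        Dim innerList As New CodePropertyReferenceExpression( _
            New CodeThisReferenceExpression(), "InnerList")
        Dim callAdd As New CodeMethodInvokeExpression( _
            innerList, "Add", New CodeArgumentReferenceExpression("value"))
        method.Statements.Add(New CodeMethodReturnStatement(callAdd))
        Return method
    End Function

    ' Hosts the method in a class so it can be rendered on its own.
    Function Render(ByVal method As CodeMemberMethod, ByVal provider As CodeDomProvider) As String
        Dim decl As New CodeTypeDeclaration("PlayerList")
        decl.Members.Add(method)
        Using writer As New StringWriter()
            provider.GenerateCodeFromType(decl, writer, New CodeGeneratorOptions())
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Console.WriteLine(Render(BuildAddMethod("Player"), New VBCodeProvider()))
    End Sub
End Module
```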
There are many kinds of code that you will want to generate. You might want to generate enumerated types, exception handling blocks, structures, fields, constructors, events, and event handlers. All of these elements can be generated with the CodeDOM. You will have to explore the classes in System.CodeDom and System.CodeDom.Compiler to find out more. For example, you will need to explicitly generate a default constructor for the typed collection if you want to return it from a web service. This isn't especially hard to figure out now that you know the basics. I left it as an exercise. If you are having problems then write me at pkimmel@softconcepts.com for a hint.
Compiling Generated Code
After the code tree has been created we can generate the code, saving it to a file, or we can actually compile the code using an ICodeCompiler. This is possible because we started with a CodeCompileUnit. Of course, you can also generate the code and use the command-line compiler or add the code to a new or existing project. Line 110 demonstrates how to generate the code text, and the shared method ListGenerator.CompileCode on lines 34 through 53 of Listing 2 demonstrates how to actually compile the code to an assembly. The code is pretty straightforward; let's walk through it quickly.
Line 36 creates the provider. I used a VBCodeProvider. What you will find out with .NET is that writing in VB.NET or C# or combining the two really doesn't matter. You can mix both languages into the same application. I picked VB because this article has emphasized VB.NET. Line 38 generates the code using the GetCode method we spoke about already. Line 41 declares an ICodeCompiler and we ask the VBCodeProvider to create a compiler instance for us. (Another example of a factory method in use.) Line 42 creates a CompilerParameters object used to represent arguments to the compiler. Lines 43 and 44 create a dependency on the Player.dll library. The PlayerList.dll will contain a collection of Player objects. (I cheated here. You will need to use the commented-out code on line 45 to refer to the actual assembly of whatever type your collection will be holding.) Line 46 dynamically creates a name for the assembly using the convention collectedtypeList.vb. Line 47 indicates that we are compiling to a DLL, and line 48 indicates that the assembly will be written to disk.
Finally, the assembly is compiled on lines 50 and 51, returning a CompilerResults object that contains the same information you see in Visual Studio's Output window when you compile an assembly. Assuming the CompilerResults object contains no errors you will have an assembly that can be used like any other, as well as the source code that goes with it. You can modify the source code or use it for debugging purposes.
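The compile step might look roughly like the sketch below. It is not the article's CompileCode method: the names are hypothetical, it compiles an empty placeholder class so that it has no dependency on Player.dll, and it calls CompileAssemblyFromDom on the provider. It assumes the .NET Framework; on newer .NET runtimes CodeDOM compilation is not supported, which the sketch reports instead of failing.

```vbnet
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports Microsoft.VisualBasic

Module CompileSketch
    ' Compiler arguments: build a DLL, written to disk at outputPath.
    Function BuildParameters(ByVal outputPath As String, _
                             ByVal ParamArray references() As String) As CompilerParameters
        Dim options As New CompilerParameters()
        options.ReferencedAssemblies.Add("System.dll")
        For Each reference As String In references
            options.ReferencedAssemblies.Add(reference)   ' e.g. the collected type's DLL
        Next
        options.OutputAssembly = outputPath
        options.GenerateExecutable = False   ' a library, not an EXE
        options.GenerateInMemory = False     ' keep the assembly on disk
        Return options
    End Function

    Function BuildUnit() As CodeCompileUnit
        Dim unit As New CodeCompileUnit()
        Dim ns As New CodeNamespace("Generated")              ' hypothetical namespace
        ns.Types.Add(New CodeTypeDeclaration("Placeholder"))  ' stand-in for the collection
        unit.Namespaces.Add(ns)
        Return unit
    End Function

    Sub Main()
        Dim provider As New VBCodeProvider()
        Try
            Dim results As CompilerResults = _
                provider.CompileAssemblyFromDom(BuildParameters("Placeholder.dll"), BuildUnit())
            If results.Errors.HasErrors Then
                For Each compileError As CompilerError In results.Errors
                    Console.WriteLine("Line {0}: {1}", compileError.Line, compileError.ErrorText)
                Next
            Else
                Console.WriteLine("Compiled to " & results.PathToAssembly)
            End If
        Catch ex As PlatformNotSupportedException
            Console.WriteLine("CodeDOM compilation is not available on this runtime.")
        End Try
    End Sub
End Module
```

For a real typed collection you would pass the path of the assembly that defines the collected type as an extra reference, which is the dependency the text describes for Player.dll.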
Creating and Using the Dynamic Typed Collection
You already know how to reference assemblies and use the classes in them. You do this every time you create a .NET project. To be thorough, I provided an example that shows you how to dynamically load the PlayerList.dll assembly and create an instance of the collection without stopping to add a reference to the assembly. This is done dynamically in Listing 4. Line 1 generates the code and compiles it based on a type named Player. Line 2 uses the shared method to dynamically load an assembly. Here is an example where I am forced to use the [] around the class name [Assembly] because it is also a reserved word. Line 4 retrieves the type object for the PlayerList, which is used to create an instance using an Activator on line 5. Line 7 creates a player inline, adding it to the strongly typed PlayerList collection, which in turn, is bound to a DataGrid.
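The load-and-activate sequence can be sketched without the generated DLL. This is not Listing 4: CreateList is a hypothetical helper, and to stay runnable on its own the demo loads the assembly that defines ArrayList and creates an ArrayList by name. With the article's output you would instead pass the path to PlayerList.dll and the name of the generated collection type.

```vbnet
Imports System
Imports System.Collections
Imports System.Reflection

Module DynamicLoadSketch
    ' Loads an assembly from disk, looks a type up by name, and
    ' creates an instance through its default constructor.
    Function CreateList(ByVal assemblyPath As String, ByVal typeName As String) As IList
        ' The brackets let Assembly be used as a class name.
        Dim loaded As [Assembly] = [Assembly].LoadFrom(assemblyPath)
        Dim listType As Type = loaded.GetType(typeName, True)   ' True: throw if not found
        Return CType(Activator.CreateInstance(listType), IList)
    End Function

    Sub Main()
        ' Stand-in for "PlayerList.dll": the assembly that defines ArrayList.
        Dim path As String = GetType(ArrayList).Assembly.Location
        Dim list As IList = CreateList(path, "System.Collections.ArrayList")

        list.Add("Ada")
        Console.WriteLine(list.Count)
    End Sub
End Module
```

Working through IList keeps the calling code free of any compile-time reference to the loaded type, at the cost of losing the strong typing that the generated collection provides to callers that do reference it.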
Summary
I could have really used the CodeDOM in the early 1990s when I was writing Basic 7 for DOS or even Visual Basic for DOS. There is a lot of potential wrapped up in the CodeDOM, much of which hasn't been discovered yet. On my current project, we are exploring using the CodeDOM to create code generators for an ASP.NET and C# application. The generators we have already written create strongly typed collections, and the ones we are working on will be instrumental in automatically generating ASP.NET web pages based on our application's object model. In the future it is also possible that the capability for software to write code may lead to adaptive applications that evolve in capability automatically. It is a bit of science fiction still, but it is possible to write code that generates complex types, including classes, in .NET.
Paul Kimmel is an architect and programmer for Software Conceptions Inc. Paul is the frameworks columnist for Windows Developer Magazine and the author of The Visual Basic .NET Developer's Book; Addison-Wesley, Fall 2002. Paul is currently working on an Enterprise application in Oregon. You may contact him at pkimmel@softconcepts.com.