WindowsDevCenter.com


Generics in .NET 2.0

by Venkat Subramaniam
06/20/2005

Generics in .NET 2.0 is an exciting feature. But what are generics? Are they for you? Should you use them in your applications? In this article, we'll answer these questions and take a closer look at generics usage, and their capabilities and limitations.

Type Safety

Many of the languages in .NET, like C#, C++, and VB.NET (with Option Strict On), are strongly typed languages. As a programmer using these languages, you expect the compiler to perform type-safety checks. For instance, if you try to treat or cast a reference of the type Book as a reference of the type Vehicle, the compiler will tell you that such a cast is invalid.

However, when it comes to collections in .NET 1.0 and 1.1, there is no help with type safety. Consider an ArrayList, for example. It holds a collection of objects. This allows you to place an object of just about any type into an ArrayList. Let's take a look at the code in Example 1.

Example 1. Lack of type safety in ArrayList


using System;
using System.Collections;

namespace TestApp
{
        class Test
        {
                [STAThread]
                static void Main(string[] args)
                {
                        ArrayList list = new ArrayList();

                        list.Add(3);
                        list.Add(4);
                        //list.Add(5.0);

                        int total = 0;
                        foreach(int val in list)
                        {
                                total = total + val;
                        }

                        Console.WriteLine(
                                "Total is {0}", total);
                }
        }
}

I am creating an instance of ArrayList and adding 3 and 4 to it. Then I loop through the ArrayList, fetching the int values from it and adding them. This program will produce the result "Total is 7". Now, if I uncomment the statement:


list.Add(5.0);

the program will produce a runtime exception:


Unhandled Exception: System.InvalidCastException: Specified cast is not valid.
at TestApp.Test.Main(String[] args) in c:\workarea\testapp\class1.cs:line 18

What went wrong? Remember that ArrayList holds a collection of objects. When you add a 3 to the ArrayList, you are boxing the value 3. When you loop through the list, you are unboxing the elements as int. However, when you add the value 5.0, you are boxing a double. On line 18, that double value is being unboxed as an int, and that is the cause of the failure.

(The above example would not fail if it were written in VB.NET, however. The reason is that VB.NET, instead of unboxing, invokes a method that converts the values to Integer. The VB.NET code will still fail if a value in the ArrayList is not convertible to Integer. See Gotcha #9, "Typeless ArrayList Isn't Type-Safe," in my book .NET Gotchas for further details.)
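
The difference between unboxing and converting can be seen directly in C#. The following is only a sketch and is not part of Example 1; the names UnboxingSketch, TryUnboxAsInt, and ConvertToInt are made up for illustration. It shows that a boxed double cannot be unboxed as an int, while a run-time conversion (similar in spirit to what the VB.NET code does) succeeds.

```csharp
using System;

public static class UnboxingSketch
{
    // Attempts a direct unbox to int. This works only when the boxed
    // value really is an int; a boxed double raises InvalidCastException.
    public static bool TryUnboxAsInt(object boxed, out int result)
    {
        try
        {
            result = (int)boxed;
            return true;
        }
        catch (InvalidCastException)
        {
            result = 0;
            return false;
        }
    }

    // Converts at run time instead of unboxing.
    public static int ConvertToInt(object boxed)
    {
        return Convert.ToInt32(boxed);
    }

    public static void Main()
    {
        int value;
        Console.WriteLine(TryUnboxAsInt(3, out value));    // True
        Console.WriteLine(TryUnboxAsInt(5.0, out value));  // False
        Console.WriteLine(ConvertToInt(5.0));              // 5
    }
}
```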

As a programmer who is used to the type safety provided by the language, you would rather have the problems pop up during compile time instead of runtime. This is where generics come in.

What Are Generics?

Generics allow you to realize type safety at compile time. They allow you to create a data structure without committing to a specific data type. When the data structure is used, however, the compiler makes sure that the types used with it are consistent for type safety. Generics provide type safety, but without any loss of performance or code bloat. While they are similar to templates in C++ in this regard, they are very different in their implementation.
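
To make the idea concrete, here is a minimal sketch of a data structure written without committing to a specific type. Holder<T> and HolderSketch are hypothetical names used only for this illustration; they are not framework classes.

```csharp
using System;

// A hypothetical container written once, for any element type T.
public class Holder<T>
{
    private T item;

    public Holder(T item)
    {
        this.item = item;
    }

    public T Item
    {
        get { return item; }
        set { item = value; }
    }
}

public static class HolderSketch
{
    public static void Main()
    {
        Holder<int> number = new Holder<int>(3);
        Holder<string> text = new Holder<string>("three");

        int n = number.Item;       // no cast and no unboxing needed
        string s = text.Item;
        // number.Item = "three";  // would not compile: a string is not an int

        Console.WriteLine("{0} {1}", n, s);
    }
}
```

The commented-out assignment is the kind of mistake that an ArrayList would accept silently and a generic type rejects at compile time.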

Using Generic Collections

The System.Collections.Generic namespace contains the generic collections in .NET 2.0. Various collection and container classes have been "parameterized." To use them, simply specify the type for the parameterized type and off you go. See Example 2:

Example 2. Type-safe generic List


List<int> aList = new List<int>();
aList.Add(3);
aList.Add(4);
// aList.Add(5.0);
int total = 0;
foreach(int val in aList)
{
        total = total + val;
}
Console.WriteLine("Total is {0}", total);

In Example 2, I am creating an instance of the generic List with the type int, given within the angle brackets (<>), as the parameterized type. This code, when executed, will produce the result "Total is 7". Now, if I uncomment the statement aList.Add(5.0);, I will get a compilation error. The compiler determines that it can't send the value 5.0 to the method Add(), as it only accepts an int. Unlike Example 1, this code has type safety built into it.

CLR Support for Generics

Generics are not merely a language-level feature. The .NET CLR recognizes generics. In that regard, generics are a first-class feature in .NET. A separate class is not emitted into the Microsoft Intermediate Language (MSIL) for each type used with a generic. In other words, your assembly contains only one definition of your parameterized data structure or class, irrespective of how many different types are used for that parameterized type. For instance, if you define a generic type MyList<T>, only one definition of that type is present in MSIL. When the program executes, different classes are dynamically created, one for each type used for the parameterized type. If you use MyList<int> and MyList<double>, then two classes are created on the fly when your program executes. Let's examine this further in Example 3.

Example 3. Writing a generic class


//MyList.cs
#region Using directives

using System;
using System.Collections.Generic;
using System.Text;

#endregion

namespace CLRSupportExample
{
        public class MyList<T>
        {
                private static int objCount = 0;

                public MyList()
                {
                        objCount++;
                }

                public int Count
                {
                        get
                        {
                                return objCount;
                        }
                }
        }
}

//Program.cs
#region Using directives

using System;
using System.Collections.Generic;
using System.Text;

#endregion

namespace CLRSupportExample
{
        class SampleClass {}

        class Program
        {
                static void Main(string[] args)
                {
                        MyList<int> myIntList = new MyList<int>();
                        MyList<int> myIntList2 = new MyList<int>();

                        MyList<double> myDoubleList 
                                        = new MyList<double>();

                        MyList<SampleClass> mySampleList 
                                        = new MyList<SampleClass>();
                                        
                        Console.WriteLine(myIntList.Count);
                        Console.WriteLine(myIntList2.Count);
                        Console.WriteLine(myDoubleList.Count);
                        Console.WriteLine(mySampleList.Count);
                        Console.WriteLine(
                                new MyList<SampleClass>().Count);

                        Console.ReadLine();
                }
        }
}

I have created a generic class named MyList. To parameterize it, I simply added a type parameter in angle brackets after the class name. The T within <> represents the actual type that will be specified when the class is used. Within the MyList class, I have a static field, objCount. I am incrementing this within the constructor so I can find out how many objects of that type are created by the user of my class. The Count property returns the number of instances of the same type as the instance on which it is called.

In the Main() method, I am creating two instances of MyList<int>, one instance of MyList<double>, and two instances of MyList<SampleClass>, where SampleClass is a class I have defined. The question is: what will be the value of Count? That is, what is the output from the above program? Go ahead and think on this and try to answer this question before you read further.

Have you worked the above question? Did you get the following answer?


2
2
1
1
2

The first two values of 2 are for MyList<int>. The first value of 1 is for MyList<double>. The second value of 1 is for MyList<SampleClass>; only one instance of this type had been created at that point in the control flow. The last value of 2 is also for MyList<SampleClass>, since another instance of this type has been created at this point in the code. The above example illustrates that MyList<int> is a different class from MyList<double>, which in turn is a different class from MyList<SampleClass>. So, in this example, we have four classes of MyList: MyList<T>, MyList<int>, MyList<double>, and MyList<SampleClass>. Again, while there are four classes of MyList, only one is stored in MSIL. How can we prove this? Figure 1 shows the MSIL using the ildasm.exe tool.

Figure 1. A look at MSIL for Example 3
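
The run-time side of this can also be checked with reflection. The sketch below is an illustration under stated assumptions: it uses an empty stand-in for MyList<T> rather than the class from Example 3, and ConstructedTypesSketch is a made-up name. It compares the Type objects of two constructed types and asks each for the generic definition it was built from.

```csharp
using System;

// Empty stand-in for the MyList<T> class of Example 3.
public class MyList<T> { }

public static class ConstructedTypesSketch
{
    public static void Main()
    {
        Type intList = typeof(MyList<int>);
        Type doubleList = typeof(MyList<double>);

        // The two constructed types are distinct classes at run time...
        Console.WriteLine(intList == doubleList);  // False

        // ...but both are built from the same single generic definition.
        Console.WriteLine(
            intList.GetGenericTypeDefinition() == typeof(MyList<>));     // True
        Console.WriteLine(
            doubleList.GetGenericTypeDefinition() == typeof(MyList<>));  // True
    }
}
```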

Generic Methods

In addition to having generic classes, you may also have generic methods. Generic methods may be part of any class. Let's look at Example 4:

Example 4. A generic method


public class Program
{
        public static void Copy<T>(List<T> source, List<T> destination)
        {
                foreach (T obj in source)
                {
                        destination.Add(obj);
                }
        }

        static void Main(string[] args)
        {
                List<int> lst1 = new List<int>();
                lst1.Add(2);
                lst1.Add(4);

                List<int> lst2 = new List<int>();
                Copy(lst1, lst2);
                Console.WriteLine(lst2.Count);
        }
}

The Copy() method is a generic method that works with the parameterized type T. When Copy() is invoked in Main(), the compiler figures out the specific type to use, based on the arguments presented to the Copy() method.
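
The type argument can also be written out explicitly instead of being inferred. The sketch below repeats the Copy() method from Example 4 inside a hypothetical class named CopySketch and calls it both ways; the commented-out call shows a case where no single T fits both arguments.

```csharp
using System;
using System.Collections.Generic;

public static class CopySketch
{
    // Same shape as Copy() in Example 4.
    public static void Copy<T>(List<T> source, List<T> destination)
    {
        foreach (T obj in source)
        {
            destination.Add(obj);
        }
    }

    public static void Main()
    {
        List<string> names = new List<string>();
        names.Add("a");
        names.Add("b");

        List<string> inferred = new List<string>();
        Copy(names, inferred);               // T is inferred as string

        List<string> explicitCopy = new List<string>();
        Copy<string>(names, explicitCopy);   // T is given explicitly

        // List<int> numbers = new List<int>();
        // Copy(names, numbers);  // would not compile: T cannot be both string and int

        Console.WriteLine(inferred.Count + explicitCopy.Count);  // 4
    }
}
```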

Unbounded Type Parameters

If you create generic data structures or classes, like MyList in Example 3, there are no restrictions on the type that may be used for the parameterized type. This leads to some limitations, however. For example, you are not allowed to use ==, !=, or < on instances of the parameterized type:


if (obj1 == obj2) …

The implementation of operators such as == and != is different for value types and reference types, so the behavior of the code would not be easy to understand if these operators were allowed arbitrarily. Another restriction is the use of the default constructor. For instance, if you write new T(), you will get a compilation error, because not all classes have a no-parameter constructor. What if you do want to create an object using new T(), or you want to use operators such as == and !=? You can, but first you have to constrain the type that can be used for the parameterized type. Let's look at how to do that.
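
As an aside, some operations are available even when T is unconstrained. The sketch below is an alternative to the operators discussed above, not a replacement for constraints: it uses EqualityComparer<T>.Default for equality checks and default(T) to obtain an "empty" value. UnboundedSketch, AreEqual, and Empty are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class UnboundedSketch
{
    // "obj1 == obj2" would not compile for an unconstrained T.
    // EqualityComparer<T>.Default works for any T.
    public static bool AreEqual<T>(T obj1, T obj2)
    {
        return EqualityComparer<T>.Default.Equals(obj1, obj2);
    }

    // "new T()" would not compile here either. default(T) gives null
    // for reference types and a zeroed value for value types.
    public static T Empty<T>()
    {
        return default(T);
    }

    public static void Main()
    {
        Console.WriteLine(AreEqual(3, 3));           // True
        Console.WriteLine(AreEqual("a", "b"));       // False
        Console.WriteLine(Empty<int>());             // 0
        Console.WriteLine(Empty<string>() == null);  // True
    }
}
```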

Constraints and Their Benefits

A generic class allows you to write your class without committing to any type, yet allows the user of your class, later on, to indicate the specific type to be used. This gives great flexibility. By placing some constraints on the types that may be used for the parameterized type, however, you gain some control in writing your class. Let's look at an example:

Example 5. The need for constraints: code that will not compile


public static T Max<T>(T op1, T op2) 
{
        if (op1.CompareTo(op2) < 0)
                return op2;
        return op1;
}

The code in Example 5 will produce a compilation error:


Error 1 'T' does not contain a definition for 'CompareTo'

Assume I need the type to support the CompareTo() method. I can specify this by using the constraint that the type specified for the parameterized type must implement the IComparable interface. Example 6 has the code:

Example 6. Specifying a constraint


public static T Max<T>(T op1, T op2) where T : IComparable
{
        if (op1.CompareTo(op2) < 0)
                return op2;
        return op1;
}

In Example 6, I have specified the constraint that the type used for parameterized type must inherit from (implement) IComparable. The following constraints may be used:


where T : struct          type must be a value type (a struct)
where T : class           type must be reference type (a class)
where T : new()           type must have a no-parameter constructor
where T : class_name      type may be either class_name or one of its
                          sub-classes (or is below class_name 
                          in the inheritance hierarchy)
where T : interface_name  type must implement the specified interface

You may specify a combination of constraints, as in: where T : IComparable, new(). This says that the type for the parameterized type must implement the IComparable interface and must have a no-parameter constructor.
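
A minimal sketch of such a combination follows. ConstraintSketch and LargestOrNew are hypothetical names used for illustration; the method relies on the IComparable constraint to compare items and on the new() constraint to create a value when the list is empty.

```csharp
using System;
using System.Collections.Generic;

public static class ConstraintSketch
{
    // Returns the largest item, or a freshly constructed T
    // when the list is empty.
    public static T LargestOrNew<T>(List<T> items)
        where T : IComparable, new()
    {
        if (items.Count == 0)
        {
            return new T();  // allowed by the new() constraint
        }

        T largest = items[0];
        foreach (T item in items)
        {
            // CompareTo is available through the IComparable constraint.
            if (item.CompareTo(largest) > 0)
            {
                largest = item;
            }
        }
        return largest;
    }

    public static void Main()
    {
        List<int> numbers = new List<int>();
        Console.WriteLine(LargestOrNew(numbers));  // 0

        numbers.Add(4);
        numbers.Add(9);
        numbers.Add(2);
        Console.WriteLine(LargestOrNew(numbers));  // 9
    }
}
```

Note that a type must satisfy every constraint listed. For example, string implements IComparable but has no public no-parameter constructor, so LargestOrNew could not be called with a List<string>.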

Inheritance and Generics

A generic class that uses parameterized types, like MyClass1<T>, is called an open-constructed generic. A generic class that uses no parameterized types, like MyClass1<int>, is called a closed-constructed generic.

You may derive from a closed-constructed generic; that is, you may inherit a class named MyClass2 from another class named MyClass1, as in:


public class MyClass2<T> : MyClass1<int>

You may derive from an open-constructed generic, provided the derived class is itself generic and supplies the type parameter. For example:


public class MyClass2<T> : MyClass1<T>

is valid, but


public class MyClass2<T> : MyClass1<Y>

is not valid, where Y is a type parameter that the derived class does not declare. Non-generic classes may derive from closed-constructed generic classes, but not from open-constructed generic classes. That is,


public class MyClass : MyClass1<int>

is valid, but


public class MyClass : MyClass1<T>

is not.
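
The valid combinations can be put together in one compilable sketch. The class bodies here are assumptions made for illustration: MyClass1<T> is given a single public field, and the name MyClass3 is introduced only so that both generic derivations can live in the same file.

```csharp
using System;

public class MyClass1<T>
{
    public T Value;
}

// Non-generic class deriving from a closed-constructed generic.
public class MyClass : MyClass1<int> { }

// Generic class deriving from a closed-constructed generic.
public class MyClass2<T> : MyClass1<int> { }

// Generic class deriving from an open-constructed generic,
// passing its own type parameter T along to the base class.
public class MyClass3<T> : MyClass1<T> { }

public static class InheritanceSketch
{
    public static void Main()
    {
        MyClass3<string> derived = new MyClass3<string>();
        derived.Value = "text";             // Value is a string here
        MyClass1<string> asBase = derived;  // usable where its base is expected

        Console.WriteLine(
            typeof(MyClass).BaseType == typeof(MyClass1<int>));              // True
        Console.WriteLine(
            typeof(MyClass3<string>).BaseType == typeof(MyClass1<string>));  // True
        Console.WriteLine(asBase.Value);                                     // text
    }
}
```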

Generics and Substitutability

When we deal with inheritance, we need to be careful about substitutability. If B inherits from A, then anywhere an object of A is used, an object of B may also be used. Let's assume we have a Basket of Fruits (Basket<Fruit>), and that Apple and Banana (kinds of Fruit) inherit from Fruit. Should a Basket of Apples (Basket<Apple>) inherit from a Basket of Fruits (Basket<Fruit>)? The answer is no, if we think about substitutability. Why? Consider a method that works with a Basket of Fruits:


public void Package(Basket<Fruit> aBasket)
{
        aBasket.Add(new Apple());
        aBasket.Add(new Banana());
}

If an instance of Basket<Fruit> is sent to this method, the method would add an Apple and a Banana. However, what would the effect be of sending an instance of a Basket<Apple> to this method? You see, this gets tricky. That is why if you write:


Basket<Apple> anAppleBasket = new Basket<Apple>();
Package(anAppleBasket);

you will get an error:


Error 2 Argument '1': 
cannot convert from 'TestApp.Basket<TestApp.Apple>' 
to 'TestApp.Basket<TestApp.Fruit>'

The compiler protects us from shooting ourselves in the foot by making sure we don't arbitrarily pass a collection of derived where a collection of base is expected. That is pretty good, isn't it?

Wait a minute, though! That was great in the above example, but there are times when I do want to pass a collection of derived where a collection of base is needed. For instance, consider an Animal (such as Monkey), which has a method named Eat that takes a Basket<Fruit>, as shown below:


public void Eat(Basket<Fruit> fruits)
{
        foreach (Fruit aFruit in fruits)
        {
                // code to eat fruit
        }
}

Now, you may call:


Basket<Fruit> fruitsBasket = new Basket<Fruit>();
… // Fruits added to Basket
anAnimal.Eat(fruitsBasket);

What if you have a Basket<Banana> with you? Would it make sense to send a Basket<Banana> to the Eat method? In this case, it would, no? But the compiler will give us an error if we try:


Basket<Banana> bananaBasket = new Basket<Banana>();
//…
anAnimal.Eat(bananaBasket);

The compiler is protecting us here. How can we ask the compiler to let us through in this particular case? Again, constraints come in handy for this:


public void Eat<T>(Basket<T> fruits) where T : Fruit
{
        foreach (Fruit aFruit in fruits)
        {
                // code to eat fruit
        }
}

In writing the Eat() method, I am asking the compiler to allow a Basket of any type T, where T is of the type Fruit or any class that inherits from Fruit.
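
A complete version of this idea might look like the sketch below. The article does not show how Basket, Fruit, or Animal are defined, so the definitions here are assumptions made purely for illustration: Basket<T> simply derives from List<T>, each fruit exposes a Name property, and Eat() returns the number of fruits eaten so that the result can be checked.

```csharp
using System;
using System.Collections.Generic;

public class Fruit
{
    public virtual string Name { get { return "fruit"; } }
}

public class Apple : Fruit
{
    public override string Name { get { return "apple"; } }
}

public class Banana : Fruit
{
    public override string Name { get { return "banana"; } }
}

// Hypothetical basket: a thin wrapper over List<T>.
public class Basket<T> : List<T> { }

public class Animal
{
    // Accepts a Basket of Fruit or of any class derived from Fruit.
    public int Eat<T>(Basket<T> fruits) where T : Fruit
    {
        int eaten = 0;
        foreach (Fruit aFruit in fruits)
        {
            Console.WriteLine("Eating " + aFruit.Name);
            eaten++;
        }
        return eaten;
    }
}

public static class EatSketch
{
    public static void Main()
    {
        Basket<Banana> bananaBasket = new Basket<Banana>();
        bananaBasket.Add(new Banana());
        bananaBasket.Add(new Banana());

        Basket<Fruit> fruitsBasket = new Basket<Fruit>();
        fruitsBasket.Add(new Apple());

        Animal anAnimal = new Animal();
        Console.WriteLine(anAnimal.Eat(bananaBasket));  // T is Banana
        Console.WriteLine(anAnimal.Eat(fruitsBasket));  // T is Fruit
    }
}
```

Because Eat() only reads from the basket, accepting a Basket<Banana> is safe here. A method like Package(), which adds arbitrary fruits, could not be written this way, since adding an Apple to a Basket<T> would not compile.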
