WindowsDevCenter.com

Ten Traps in C# for C++ Programmers

by Jesse Liberty
08/16/2001

In a recent article for the July 2001 MSDN Magazine I talked about "What You Need to Know to Move from C++ to C#." In that article I mentioned that the syntax of C# is very similar to that of C++, and that the difficult part of the transition was not the language itself, but getting comfortable with the managed environment of .NET and understanding the extensive .NET Framework.

I've begun to compile a list of the syntax differences that do exist between C++ and C# (the list is available on my Web site: click on Books, then navigate to Programming C# and click on the FAQ). As you might expect, most of the syntactic changes are small and nearly trivial. There are a few changes that are potential traps for the unwary C++ programmer, however, and this article will focus on the ten most dangerous.

Trap #1: Nondeterministic finalization and the C# destructor.

Almost certainly, the biggest difference in C# for most C++ programmers will be garbage collection. While this means you no longer have to worry about memory leaks and ensuring that pointers are deleted, you also give up precise control over when your objects will be destroyed. In fact, there is no explicit destructor in C#.

If you do control an unmanaged resource, however, you will need to explicitly free that resource when you are done with it. Implicit control over a resource is provided with a Finalize method (called a finalizer), which will be called by the garbage collector when your object is destroyed.


The finalizer should only release unmanaged resources that are held by your object, and should not reference other objects. Note that if you have only managed references you do not need to and should not implement the Finalize method; you want this only for handling unmanaged resources. Because there is some cost to having a finalizer, you ought to implement this only on classes that require it (that is, classes that consume valuable unmanaged resources).

You must never call an object's Finalize method directly; the garbage collector will call Finalize for you. Your finalizer should, however, chain to the base class's Finalize method, which the C# destructor syntax does for you automatically.

C#'s destructor looks, syntactically, much like a C++ destructor, but it behaves quite differently. The C# destructor is simply a shortcut for declaring a Finalize method that chains up to its base class. Thus writing:

~MyClass()
{ 
   // do work here
}

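is treated by the compiler as if you had written the following (you cannot write this form yourself; the compiler rejects a hand-written override of Finalize and asks for a destructor instead):

protected override void Finalize()
{
   try { /* do work here */ }
   finally { base.Finalize(); }
}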

Trap #2: Finalize versus Dispose.

It is not legal to call a finalizer explicitly. Your Finalize method will be called by the garbage collector. If you do handle limited unmanaged resources (such as file handles) that you want to close and dispose of as quickly as possible, you ought to implement the IDisposable interface. This interface has one method, Dispose, which will perform your cleanup. Your clients are then responsible for calling your Dispose method explicitly. Dispose is the way for your clients to say "don't wait for Finalize to be called, do it right now."

If you provide a Dispose method, you ought to stop the garbage collector from calling Finalize on your object, since the cleanup will be provided explicitly. To do so, you call the static method GC.SuppressFinalize, passing in the this reference for your object. Your finalizer can then call your Dispose method.

Thus you might write:

public void Dispose()
{
  // perform clean up

  // tell the GC not to finalize
  GC.SuppressFinalize(this);
}

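// MyClass stands in for the name of your class. The
// destructor is compiled into a Finalize method that
// chains to the base class finalizer automatically.
~MyClass()
{
  Dispose();
}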

For some objects, you'd rather have your clients call Close (for example, Close makes more sense than Dispose for file objects). You can implement this by creating a private Dispose method and a public Close method, and have your Close method invoke Dispose.
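
One way to get this effect is an explicit interface implementation, sketched below. The LogFile class and its closed flag are hypothetical names invented for illustration. Dispose is hidden from clients who hold a LogFile reference, yet it is still reachable through IDisposable.

```csharp
using System;

// Hypothetical resource wrapper whose clients call Close rather than Dispose.
class LogFile : IDisposable
{
   private bool closed = false;

   public bool IsClosed
   {
      get { return closed; }
   }

   // Explicit interface implementation: callable only through an
   // IDisposable reference, so it does not appear as LogFile.Dispose.
   void IDisposable.Dispose()
   {
      if (!closed)
      {
         // release the unmanaged resource here
         closed = true;
         GC.SuppressFinalize(this);
      }
   }

   // The public face of the cleanup simply forwards to Dispose.
   public void Close()
   {
      ((IDisposable) this).Dispose();
   }
}

class CloseDemo
{
   public static void Main()
   {
      LogFile log = new LogFile();
      log.Close();
      Console.WriteLine(log.IsClosed);
   }
}
```

Checking the closed flag makes a second call to Close harmless, which matters when clients call it defensively.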

Because you cannot be certain that your client will call Dispose reliably, and because finalization is nondeterministic (you can't control when the GC will run), C# provides a using statement, which ensures that Dispose will be called at the earliest possible time. The idiom is to declare which objects you are using, and then to create a scope for these objects with curly braces. When the close brace is reached the Dispose method will be called on the object automatically:

using System.Drawing;
class Tester
{
   public static void Main()
   {
      using (Font theFont = new Font("Arial", 10.0f))
      {
         // use theFont

      }   // compiler will call Dispose on theFont

      Font anotherFont = new Font("Courier",12.0f);
	  
      using (anotherFont)
      {
         // use anotherFont

      }  // compiler calls Dispose on anotherFont

   }
 
}

In the first part of this example, the Font object is created within the using statement. When the using statement ends, Dispose is called on the Font object. In the second part of the example, a Font object is created outside of the using statement. When you decide to use that font, you put it inside the using statement and when that statement ends, once again Dispose is called.


The using statement also protects you against unanticipated exceptions. No matter how control leaves the using statement, Dispose is called. It is as if there were an implicit try-finally block.
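
As a rough sketch of that implicit block, the two methods below should behave the same way. The Resource class and the method names are hypothetical stand-ins invented for illustration, and the hand-written version approximates the compiler's output rather than reproducing it exactly.

```csharp
using System;

// Hypothetical disposable type that records whether Dispose was called.
class Resource : IDisposable
{
   public bool Disposed = false;

   public void Dispose()
   {
      Disposed = true;
   }
}

class UsingDemo
{
   // The using statement form.
   public static Resource WithUsing()
   {
      Resource r = new Resource();
      using (r)
      {
         // use r
      }
      return r;
   }

   // Roughly what the compiler generates for the method above.
   public static Resource ByHand()
   {
      Resource r = new Resource();
      try
      {
         // use r
      }
      finally
      {
         // runs whether the try block completes or throws
         if (r != null)
         {
            ((IDisposable) r).Dispose();
         }
      }
      return r;
   }

   public static void Main()
   {
      Console.WriteLine(WithUsing().Disposed);
      Console.WriteLine(ByHand().Disposed);
   }
}
```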

Trap #3: C# distinguishes between value types and reference types.

Like C++, C# is a strongly typed language, and like C++, C# divides types into two sets: intrinsic (built-in) types offered by the language, and user-defined types that are defined by the programmer.

In addition to intrinsic types and user-defined types, C# differentiates between value types and reference types. Value types hold their value on the stack, like variables in C++, unless they are embedded within a reference type. Reference type variables sit on the stack, but they hold the address of an object on the heap, much like pointers in C++. Value types are passed to methods by value (a copy is made), while reference types are effectively passed by reference: the reference itself is copied, but the method works with the same object on the heap.

Classes and interfaces create reference types, but note carefully (see trap #5) that structs are value types, as are the intrinsic types other than object and string.
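
A small sketch may make the difference concrete. PointValue and PointRef are hypothetical types invented for this illustration; they are identical except that one is a struct and the other a class.

```csharp
using System;

// Hypothetical value type.
struct PointValue
{
   public int X;
}

// Hypothetical reference type with the same shape.
class PointRef
{
   public int X;
}

class ValueVsReference
{
   // Receives a copy of the struct; the caller's value is untouched.
   static void Change(PointValue p)
   {
      p.X = 99;
   }

   // Receives a copy of the reference; both refer to the same object.
   static void Change(PointRef p)
   {
      p.X = 99;
   }

   public static int StructAfterCall()
   {
      PointValue v = new PointValue();
      v.X = 1;
      Change(v);
      return v.X;
   }

   public static int ClassAfterCall()
   {
      PointRef r = new PointRef();
      r.X = 1;
      Change(r);
      return r.X;
   }

   public static void Main()
   {
      Console.WriteLine(StructAfterCall());   // prints 1
      Console.WriteLine(ClassAfterCall());    // prints 99
   }
}
```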

Trap #4: Watch out for implicit boxing.

Boxing and unboxing are the processes that enable value types (e.g., integers) to be treated as reference types (objects). The value is "boxed" inside an object, and subsequently "unboxed" back to a value type. Every type in C#, including the intrinsic types, derives from Object and may be implicitly cast to an object. Boxing a value allocates an instance of Object and copies the value into the new object instance.

Boxing is implicit, so when you provide a value type where a reference is expected the value is implicitly boxed. Boxing brings some performance overhead, so avoid boxing where possible, especially in large collections.

To return the boxed object back to a value type you must explicitly unbox it. The unboxing occurs in two steps:

1. Check the object instance to make sure it is a boxed value of the given value type.
2. Copy the value from the instance to the value-type variable.

In order for the unboxing to succeed, the object being unboxed must be a reference to an object that was created by boxing a value of the value type.

using System;
public class UnboxingTest 
{
   public static void Main() 
   {
      int i = 123;

      //Boxing
      object o = i;

      // unboxing (must be explicit)
      int j = (int) o;
      Console.WriteLine("j: {0}", j);
   }
}

If the object being unboxed is a reference to an object of a different type, an InvalidCastException is thrown. If it is null, a NullReferenceException is thrown.
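
The type check is stricter than a C++ programmer might expect. In the sketch below (the class and method names are hypothetical), a boxed long cannot be unboxed as an int, even though a long can ordinarily be cast to an int. The value has to be unboxed to its exact type first and converted afterwards.

```csharp
using System;

class UnboxingMismatch
{
   // Tries to unbox o directly to int and reports whether it worked.
   public static bool CanUnboxAsInt(object o)
   {
      try
      {
         int j = (int) o;
         return j == j;   // always true; reached only if unboxing succeeded
      }
      catch (InvalidCastException)
      {
         return false;
      }
   }

   // Unboxes to the exact type first, then converts the value.
   public static int UnboxLongThenConvert(object o)
   {
      return (int) (long) o;
   }

   public static void Main()
   {
      object boxedInt = 123;
      object boxedLong = 123L;

      Console.WriteLine(CanUnboxAsInt(boxedInt));          // prints True
      Console.WriteLine(CanUnboxAsInt(boxedLong));         // prints False
      Console.WriteLine(UnboxLongThenConvert(boxedLong));  // prints 123
   }
}
```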

Trap #5: Struct is very different in C#.

In C++ a struct is nearly identical to a class. In C++, the only difference is that a struct has public access as its default (rather than private) and its inheritance is public by default (again, rather than private). Some C++ programmers use structs as data-only objects, but that is a convention not supported by the language and discouraged by many object-oriented designers.

In C#, a struct is a simple user-defined type, a lightweight alternative that is quite different from a class. While structs do support properties, methods, fields, and operators, structs don't support inheritance or destructors.

More importantly, while a class is a reference type, a struct is a value type (see trap #3). Thus, structs are useful for representing objects that do not require reference semantics. Structs are somewhat more efficient in their use of memory in arrays; however, they may be less efficient when used in collections. Collections expect references, so structs must be boxed (see trap #4). There is overhead in boxing and unboxing, and classes may be more efficient in large collections.

Trap #6: Virtual methods must be explicitly overridden.

In C# the programmer's decision to override a virtual method must be made explicit with the override keyword.

To see why this is useful, assume that a Window class is written by Company A, and that ListBox and RadioButton classes are written by programmers from Company B using a purchased copy of the Company A Window class as a base. The programmers in Company B have little or no control over the design of the Window class, including future changes that Company A might choose to make.

Now suppose that one of the programmers for Company B decides to add a Sort method to ListBox:

public class ListBox : Window
{
   public virtual void Sort() {...}
}

This presents no problems until Company A, the author of Window, releases version 2 of its Window class. It turns out that the programmers in Company A also added a Sort method to Window:

public class Window
{
   // ...
   public virtual void Sort() {...}
}

In C++ the new virtual Sort method in Window would now act as a base method for the virtual Sort method in ListBox. The compiler would call the Sort method in ListBox when you intend to call the Sort in Window. In C# a virtual function is always considered to be the root of virtual dispatch; that is, once C# finds a virtual method, it looks no further up the inheritance hierarchy. If a new virtual Sort function is introduced into Window, the run-time behavior of ListBox is unchanged. When ListBox is compiled again, however, the compiler generates a warning:

...\class1.cs(54,24): warning CS0114: 'ListBox.Sort()' hides
inherited member 'Window.Sort()'. To make the current member
override that implementation, add the override keyword.
Otherwise add the new keyword.

To remove the warning, the programmer must indicate what he intends. He can mark the ListBox Sort method new, to indicate that it is not an override of the virtual method in Window:

public class ListBox : Window
{
   public new virtual void Sort() {...}
}

This action removes the warning. If, on the other hand, the programmer does want to override the method in Window, he need only use the override keyword to make that intention explicit.
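
The two choices lead to different run-time behavior when the method is called through a Window reference, as the following sketch suggests. HidingListBox and OverridingListBox are hypothetical names, and Sort returns a string here (an assumption made purely so the result can be observed) rather than void.

```csharp
using System;

class Window
{
   public virtual string Sort()
   {
      return "Window.Sort";
   }
}

// Marked new: starts a fresh virtual method unrelated to Window.Sort.
class HidingListBox : Window
{
   public new virtual string Sort()
   {
      return "HidingListBox.Sort";
   }
}

// Marked override: replaces Window.Sort in virtual dispatch.
class OverridingListBox : Window
{
   public override string Sort()
   {
      return "OverridingListBox.Sort";
   }
}

class DispatchDemo
{
   public static string CallThroughWindow(Window w)
   {
      return w.Sort();
   }

   public static void Main()
   {
      // prints Window.Sort
      Console.WriteLine(CallThroughWindow(new HidingListBox()));
      // prints OverridingListBox.Sort
      Console.WriteLine(CallThroughWindow(new OverridingListBox()));
      // prints HidingListBox.Sort
      Console.WriteLine(new HidingListBox().Sort());
   }
}
```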


Trap #7: You may not initialize in the header.

Initialization works differently in C# than it does in C++. Suppose you have a class Person with a private member variable age, and a derived class Employee with a private member variable salaryLevel. In C++ you might initialize salaryLevel in the initialization part of the Employee constructor, like this:

Employee::Employee(int theAge, int theSalaryLevel):
Person(theAge),              // initialize base
salaryLevel(theSalaryLevel)  // initialize member variable
{
   // body of constructor
}

This construct is not legal in C#. While you can still initialize the base, the initialization of the member variable as shown here would cause a compile error. You can, however, set the initial value for the member variable in C# when you declare it:

class Employee : Person
{
    // declarations here
    private int salaryLevel = 3;     // initialization
}

Note also that you do not add a semicolon after the class declaration, and that access is declared on each member individually rather than in labeled sections as in C++ (a member with no access modifier is private).
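
Putting the pieces together, a C# version of the Employee constructor might look like the sketch below. The Age and SalaryLevel properties and the one-argument constructor are additions made for illustration only. The base class is initialized with the base keyword, and the member is assigned in the constructor body.

```csharp
using System;

class Person
{
   private int age;

   public Person(int theAge)
   {
      age = theAge;
   }

   public int Age
   {
      get { return age; }
   }
}

class Employee : Person
{
   // The initializer runs before the constructor body.
   private int salaryLevel = 3;

   // Initialize the base in the header, the member in the body.
   public Employee(int theAge, int theSalaryLevel) : base(theAge)
   {
      salaryLevel = theSalaryLevel;
   }

   // Leaves salaryLevel at the value given in its declaration.
   public Employee(int theAge) : base(theAge)
   {
   }

   public int SalaryLevel
   {
      get { return salaryLevel; }
   }
}

class InitDemo
{
   public static void Main()
   {
      Console.WriteLine(new Employee(40, 7).SalaryLevel);   // prints 7
      Console.WriteLine(new Employee(40).SalaryLevel);      // prints 3
   }
}
```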

Trap #8: Boolean values do not convert to integers.

In C# Boolean values (true, false) do not convert to or from integers. Thus, if someFuncWhichReturnsAValue returns an integer, you may not write:

if  ( someFuncWhichReturnsAValue() )

and count on the idea that if someFuncWhichReturnsAValue returns a zero it will evaluate false, otherwise true. The good news is that the old error of using assignment versus equality is no longer a problem. Thus, if you write:

if ( x = 5 )

you will get a compile error, since x = 5 evaluates to 5, which is not a Boolean value.
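
The fix in both cases is to write out the comparison so that the condition is a Boolean expression. In this minimal sketch, CountItems is a hypothetical method standing in for any function that returns an integer.

```csharp
using System;

class BoolDemo
{
   // Hypothetical function returning an integer status.
   static int CountItems()
   {
      return 0;
   }

   public static string Describe()
   {
      int x = 5;

      // Compare against zero explicitly instead of relying on conversion.
      if (CountItems() != 0)
      {
         return "has items";
      }

      // Equality test; writing x = 5 here would not compile.
      if (x == 5)
      {
         return "x is five";
      }

      return "neither";
   }

   public static void Main()
   {
      Console.WriteLine(Describe());   // prints x is five
   }
}
```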

Trap #9: You may not "fall through" in switch statements.

In C# a switch statement may not "fall through" to the next statement if it does any work. Thus, while the following is legal in C++, it is not legal in C#:

switch (i)
{
   case 4:
      CallFuncOne();   // error: control cannot fall through to case 5
   case 5:
      CallSomeFunc();
      break;
}

To accomplish this, you need to use an explicit goto statement:

switch (i)
{
   case 4:
      CallFuncOne();
      goto case 5;
   case 5:
      CallSomeFunc();
      break;   // the final case may not fall out of the switch either
}

If the case statement does no work (has no code within it) then you can fall through:

switch (i)
{
   case 4: // fall through
   case 5: // fall through
   case 6:
      CallSomeFunc();
      break;
}

Trap #10: C# requires definite assignment.

C# imposes definite assignment, which requires that all variables be assigned a value before they are used. Thus, you can declare a variable without initializing it, but you may not pass it to a method until it has a value.

This raises a problem with values you create simply to pass them to a method by reference, to act as "out" parameters. For example, suppose you have a method that returns the current hour, minute and second. If you were to write:

   int theHour;
   int theMinute;
   int theSecond;
   timeObject.GetTime( ref theHour, ref theMinute, ref theSecond);

You would get a compile error for using theHour, theMinute, and theSecond without initializing them:

Use of unassigned local variable 'theHour'
Use of unassigned local variable 'theMinute' 
Use of unassigned local variable 'theSecond'

You can initialize them to zero or some other innocuous value to quiet the pesky compiler:

   int theHour = 0;
   int theMinute = 0;
   int theSecond = 0;
   timeObject.GetTime( ref theHour, ref theMinute, ref theSecond);

But that is too silly for words. The entire point of these variables is to pass them by reference into GetTime, where they'll be changed. To solve this problem, C# provides the out parameter modifier, which removes the requirement that a reference parameter be initialized before the call. The parameters to GetTime, for example, provide no information to the method; they are simply a mechanism for getting information out of it. Thus, by marking all three as out parameters, you eliminate the need to initialize them outside the method. In exchange, the method must assign a value to every out parameter before it returns. Here are the altered parameter declarations for GetTime:

  public void GetTime(out int h, out int m, out int s)
  {
      h = Hour;
      m = Minute;
      s = Second;
  }

and here is the new invocation of the GetTime method:

    timeObject.GetTime( out theHour, out theMinute, out theSecond);
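
For completeness, here is a self-contained sketch that puts the declaration and the call together. The Time class, its constructor, and the OutDemo class are hypothetical scaffolding added so the fragment can be compiled. Only GetTime and the three local variables come from the discussion above.

```csharp
using System;

// Hypothetical class holding a fixed time of day.
class Time
{
   private int Hour;
   private int Minute;
   private int Second;

   public Time(int hour, int minute, int second)
   {
      Hour = hour;
      Minute = minute;
      Second = second;
   }

   // Every out parameter must be assigned before the method returns.
   public void GetTime(out int h, out int m, out int s)
   {
      h = Hour;
      m = Minute;
      s = Second;
   }
}

class OutDemo
{
   public static string Describe(Time timeObject)
   {
      // Declared but not initialized; legal because they are passed as out.
      int theHour;
      int theMinute;
      int theSecond;

      timeObject.GetTime(out theHour, out theMinute, out theSecond);

      return String.Format("{0}:{1}:{2}", theHour, theMinute, theSecond);
   }

   public static void Main()
   {
      Console.WriteLine(Describe(new Time(14, 30, 5)));   // prints 14:30:5
   }
}
```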

Jesse Liberty is a computer consultant, trainer, and best-selling book author who specializes in .NET and Web development. His company, Liberty Associates, designs and builds Web and Windows applications and delivers intensive on-site seminars on C#, ASP.NET, and related technologies. He has been a distinguished software engineer at AT&T, vice president for technology development at CitiBank, and software architect at Xerox and PBS. Jesse provides support for his books at www.LibertyAssociates.com.


O'Reilly & Associates recently released (July 2001) Programming C#.