Tuesday, September 21, 2010

Enums in java

In prior releases, the standard way to represent an enumerated type was the int Enum pattern:
// int Enum Pattern - has severe problems!
public static final int SEASON_WINTER = 0;
public static final int SEASON_SPRING = 1;
public static final int SEASON_SUMMER = 2;
public static final int SEASON_FALL   = 3;
But these constants are not type safe, and they need a prefix such as SEASON_ to avoid name clashes. So Java 5 introduced enums.

E.g.
enum Season { WINTER, SPRING, SUMMER, FALL }

public enum Rank { DEUCE, THREE, FOUR, FIVE, SIX,
        SEVEN, EIGHT, NINE, TEN, JACK, QUEEN, KING, ACE }

public enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }
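The type-safety difference can be sketched roughly as follows. The class name SeasonDemo and the two describe methods are hypothetical and only serve to contrast the two approaches.

```java
// Hypothetical demo; SeasonDemo and describe(...) are illustrative names only.
enum Season { WINTER, SPRING, SUMMER, FALL }

class SeasonDemo {
    // int Enum pattern: the parameter accepts any int, even one that is no season.
    static final int SEASON_WINTER = 0;

    static String describe(int season) {
        return "season #" + season;   // only a bare number is available
    }

    // Enum version: only Season constants (or null) can be passed.
    static String describe(Season season) {
        return "season " + season;    // toString() gives the constant name
    }

    public static void main(String[] args) {
        System.out.println(describe(42));                      // compiles, but 42 is meaningless
        System.out.println(describe(Season.WINTER));           // season WINTER
        System.out.println(Season.valueOf("FALL").ordinal());  // 3
    }
}
```

With the enum overload, passing an arbitrary number is rejected by the compiler, while the int overload accepts it silently.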
  
Note on Semicolon ( ; )
Enum embedded inside a class. Outside the enclosing class, elements are referenced as Outter.Color.RED, Outter.Color.BLUE, etc.
public class Outter {
 public enum Color {
   WHITE, BLACK, RED, YELLOW, BLUE
 }
}
Enum that overrides the toString method. A semicolon after the last element is required whenever the enum body contains anything besides the constants; without it the enum will not compile.
public enum Color {
 WHITE, BLACK, RED, YELLOW, BLUE;  //; is required here.

 @Override public String toString() {
   //only capitalize the first letter
   String s = super.toString();
   return s.substring(0, 1) + s.substring(1).toLowerCase();
 }
}
Iterating over an enum with the values() method:

for ( Color c :Color.values() ) {
         System.out.print( c + " " );
}


Enum constants may act as case labels in switch 
enum Grade { A, B, C, D, F, INCOMPLETE }

class Student {

  private String firstName;
  private String lastName;
  private Grade grade;

  public Student(String firstName, String lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }

  // getter and setter for grade
  public void assignGrade(Grade grade) {
    this.grade = grade;
  }

  public Grade getGrade() {
    return grade;
  }
}

Now we can write a switch block such as the following (assuming student1 is a Student and outputText is a StringBuilder):
switch (student1.getGrade()) {
      case A: 
        outputText.append(" excelled with a grade of A");
        break;   
      case B: // fall through to C
      case C: 
        outputText.append(" passed with a grade of ")
                  .append(student1.getGrade().toString());
        break;
      case D: // fall through to F
      case F:
        outputText.append(" failed with a grade of ")
                  .append(student1.getGrade().toString());
        break;
      case INCOMPLETE:
        outputText.append(" did not complete the class.");
        break;
      default:
        outputText.append(" has a grade of ")
                  .append(student1.getGrade().toString());
        break;
    }
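For readers who want to run the switch on its own, here is a minimal self-contained sketch. The class GradeReport and its describe method are hypothetical; they return the phrase that the switch above would append.

```java
enum Grade { A, B, C, D, F, INCOMPLETE }

class GradeReport {
    // Hypothetical helper: maps a grade to the phrase used in the switch above.
    static String describe(Grade grade) {
        switch (grade) {
            case A:   // case labels use the bare constant name, not Grade.A
                return "excelled with a grade of A";
            case B:   // fall through to C
            case C:
                return "passed with a grade of " + grade;
            case D:   // fall through to F
            case F:
                return "failed with a grade of " + grade;
            case INCOMPLETE:
                return "did not complete the class.";
            default:
                return "has a grade of " + grade;
        }
    }

    public static void main(String[] args) {
        for (Grade g : Grade.values()) {
            System.out.println(g + ": " + describe(g));
        }
    }
}
```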



Enum with additional fields and a custom constructor. Enum constructors must be either private or package default (which for an enum is still implicitly private); the protected and public access modifiers are not allowed. When a custom constructor is declared, every element declaration must match one of the declared constructors.
public enum Color { //Color is of type enum
 WHITE(21), BLACK(22), RED(23), YELLOW(24), BLUE(25);

 private final int code;  // set once in the constructor

 private Color(int c) {
   code = c;
 }

 public int getCode() {
   return code;
 }
}
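A common companion to such a field is a reverse lookup from the stored value back to the constant. The sketch below assumes a hypothetical fromCode method, which is not part of the original enum.

```java
enum Color {
    WHITE(21), BLACK(22), RED(23), YELLOW(24), BLUE(25);

    private final int code;

    Color(int code) {            // no modifier: implicitly private for an enum
        this.code = code;
    }

    int getCode() {
        return code;
    }

    // Hypothetical reverse lookup: finds the constant carrying the given code.
    static Color fromCode(int code) {
        for (Color c : values()) {
            if (c.code == code) {
                return c;
            }
        }
        throw new IllegalArgumentException("No Color with code " + code);
    }
}

class ColorCodeDemo {
    public static void main(String[] args) {
        System.out.println(Color.RED.getCode());   // 23
        System.out.println(Color.fromCode(25));    // BLUE
    }
}
```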
Enum that implements interfaces. An enum can implement any interface. All enum types implicitly implement java.io.Serializable and java.lang.Comparable.
public enum Color implements Runnable {
 WHITE, BLACK, RED, YELLOW, BLUE;

 public void run() {
   System.out.println("name()=" + name() +
       ", toString()=" + toString());
 }
}
A sample test program to invoke this run() method:
for(Color c : Color.values()) {
 c.run();
}
Or,
for(Runnable r : Color.values()) {
 r.run();
}
Enum and their super-class
Every Java enum E implicitly extends java.lang.Enum. Since Java doesn't allow multiple inheritance of classes, an enum type can't declare a superclass of its own. It can't even explicitly extend java.lang.Enum or java.lang.Object. It also means enum A can't inherit from or extend enum B.

For example, the following is an invalid enum declaration:
public enum MyType extends Object {
ONE, TWO
}
Compiler error:
MyType.java:3: '{' expected
public enum MyType extends Object {
MyType.java:6:  expected
2 errors
 
The correct form should be:
public enum MyType {
ONE, TWO
}

Custom string values for enums
The default string value for a Java enum constant is its face value, that is, the element name. However, you can customize the string value by overriding the toString() method. For example,
public enum MyType {
ONE {
    public String toString() {
        return "this is one";
    }
},

TWO {
    public String toString() {
        return "this is two";
    }
}
}
Running the following test code will produce this:
public class EnumTest {
public static void main(String[] args) {
    System.out.println(MyType.ONE);
    System.out.println(MyType.TWO);
}
}
-------------
this is one
this is two
Another interesting fact is that once you give a constant its own body (as done here to override toString()), that constant is in effect compiled as an anonymous inner class. So after compiling the above enum, you will see an extra class file for each such constant:
MyType.class
MyType$1.class
MyType$2.class
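The effect of those extra classes can be observed at run time. The following sketch (the class name MyTypeClassDemo is hypothetical) compares getClass() with getDeclaringClass() for a constant that has its own body.

```java
enum MyType {
    ONE {
        @Override public String toString() { return "this is one"; }
    },
    TWO {
        @Override public String toString() { return "this is two"; }
    }
}

class MyTypeClassDemo {
    public static void main(String[] args) {
        // A constant with a body is an instance of its own anonymous subclass.
        System.out.println(MyType.ONE.getClass() == MyType.class);           // false
        // getDeclaringClass() still reports the enum type itself.
        System.out.println(MyType.ONE.getDeclaringClass() == MyType.class);  // true
        // name() is final, so it keeps returning the declared identifier.
        System.out.println(MyType.ONE.name());                               // ONE
    }
}
```

This is why code that needs the enum type of a constant is usually better off calling getDeclaringClass() than getClass().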

Saturday, September 18, 2010

Modifying Java Variables (w.r.t c and c++)

Modifying Simple Variable
The only mechanism for changing the value of a simple Java variable is an assignment statement. Java assignment syntax is identical to C assignment syntax. As in C, an assignment replaces the value of a variable named on the left-hand side of the equals sign by the value of the expression on the right-hand side of the equals sign.

Modifying Object Variable 
Java object variables can be changed in two ways. Like simple variables, you can make assignments to object variables. When this is done the object referenced by the variable is not changed. Instead, the reference is replaced by a reference to a different object.
With a few exceptions, the only other thing that you can do with an object variable is to send it a message. This is an important part of any Java program, allowing communication between objects.
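The two ways of changing an object variable can be sketched as follows. The class ReferenceDemo and its demo method are hypothetical; StringBuilder is used only because it is a convenient mutable class.

```java
class ReferenceDemo {
    // Returns the final contents seen through each variable: {first, alias}.
    static String[] demo() {
        StringBuilder first = new StringBuilder("abc");
        StringBuilder alias = first;        // both variables refer to the same object

        // Sending a message (calling a method) changes the object itself,
        // so the change is visible through both variables.
        alias.append("def");

        // Assignment only replaces the reference held by alias;
        // the object that first refers to is untouched.
        alias = new StringBuilder("xyz");

        return new String[] { first.toString(), alias.toString() };
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println(result[0]);   // abcdef
        System.out.println(result[1]);   // xyz
    }
}
```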


Friday, September 17, 2010

Constructor in java 1

Constructors

When you create a new instance (a new object) of a class using the new keyword, a constructor for that class is called. Constructors are used to initialize the instance variables (fields) of an object. Constructors are similar to methods, but with some important differences.
  • Constructor name is class name. A constructor must have the same name as the class it's in.
  • Default constructor. If you don't define a constructor for a class, a default parameterless constructor is automatically created by the compiler. The default constructor calls the default parent constructor (super()), and all instance variables start out with their default values (zero for numeric types, null for object references, and false for booleans).
  • Default constructor is created only if there are no constructors. If you define any constructor for your class, no default constructor is automatically created.
  • Differences between methods and constructors.
    • There is no return type given in a constructor signature (header). The value is this object itself so there is no need to indicate a return value.
    • There is no return statement in the body of the constructor.
    • The first statement of a constructor may be a call to another constructor in the same class (using this) or a call to the superclass constructor (using super). If it is neither of these, the compiler automatically inserts a call to the parameterless superclass constructor.
    These differences in syntax between a constructor and method are sometimes hard to see when looking at the source. It would have been better to have had a keyword to clearly mark constructors as some languages do.
  • this(...) - Calls another constructor in same class. Often a constructor with few parameters will call a constructor with more parameters, giving default values for the missing parameters. Use this to call other constructors in the same class.
  • super(...). Use super to call a constructor in a parent class. Calling the constructor for the superclass must be the first statement in the body of a constructor. If you are satisfied with the default constructor in the superclass, there is no need to make a call to it because it will be supplied automatically.

Example of explicit this constructor call

public class Point {
    int m_x;
    int m_y;

    //============ Constructor
    public Point(int x, int y) {
        m_x = x;
        m_y = y;
    }

    //============ Parameterless default constructor
    public Point() {
        this(0, 0);  // Calls other constructor.
    }
    . . .
}

super(...) - The superclass (parent) constructor

An object has the fields of its own class plus all fields of its parent class, grandparent class, all the way up to the root class Object. It's necessary to initialize all fields, therefore all constructors must be called! The Java compiler automatically inserts the necessary constructor calls in the process of constructor chaining, or you can do it explicitly.
The Java compiler inserts a call to the parent constructor (super) if you don't have a constructor call as the first statement of your constructor. The following is the equivalent of the constructor above.
//============ Constructor (same as in above example)
    public Point(int x, int y) {
        super();  // Automatically done if you don't call constructor here.
        m_x = x;
        m_y = y;
    }

Why you might want to call super explicitly

Normally, you won't need to call the constructor for your parent class because the call is generated automatically, but there are two cases where this is necessary.
  1. You want to call a parent constructor which has parameters (the automatically generated super constructor call has no parameters).
  2. There is no parameterless parent constructor because only constructors with parameters are defined in the parent class.
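Both cases can be sketched with a small class pair. Shape and Circle are hypothetical names chosen for illustration; the parent deliberately defines only a constructor with a parameter.

```java
class Shape {
    private final String name;

    // The only constructor takes a parameter, so Shape has no parameterless one.
    Shape(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class Circle extends Shape {
    private final double radius;

    Circle(double radius) {
        super("circle");   // required: an implicit super() would not compile here
        this.radius = radius;
    }

    Circle() {
        this(1.0);         // chains to Circle(double), which calls super(...)
    }

    double getRadius() {
        return radius;
    }
}

class CircleDemo {
    public static void main(String[] args) {
        Circle unit = new Circle();
        System.out.println(unit.getName() + " " + unit.getRadius());   // circle 1.0
    }
}
```

Removing the super("circle") line makes the compiler insert super(), which fails because Shape has no matching constructor.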

Common naming convention : Coding Style

Variable names must be in mixed case starting with lower case. 
This is common practice in the Java development community and also the naming convention for variables used by Sun for the Java core packages. It makes variables easy to distinguish from types, and effectively resolves potential naming collisions in declarations.
eg.
int state;

Names representing constants (final variables) must be all uppercase using underscore to separate words.
MAX_ITERATIONS, COLOR_RED
Common practice in the Java development community and also the naming convention used by Sun for the Java core packages.

In general, the use of such constants should be minimized. In many cases implementing the value as a method is a better choice:

int getMaxIterations() // NOT: MAX_ITERATIONS = 25
{
return 25;
}

This form is both easier to read, and it ensures a uniform interface towards class values.

Names representing methods must be verbs and written in mixed case starting with lower case.
getName(), computeTotalWidth()

Abbreviations and acronyms should not be uppercase when used as name.
exportHtmlSource(); // NOT: exportHTMLSource();
openDvdPlayer(); // NOT: openDVDPlayer();

Using all uppercase for the base name will give conflicts with the naming conventions given above. A variable of this type would have to be named dVD, hTML etc., which obviously is not very readable. Another problem is illustrated in the examples above: when the name is connected to another, the readability is seriously reduced, because the word following the acronym does not stand out as it should.

Private class variables should have underscore suffix.
class Person { 
               private String name_; 
... }
Apart from its name and its type, the scope of a variable is its most important feature. Indicating class scope by using underscore makes it easy to distinguish class variables from local scratch variables. This is important because class variables are considered to have higher significance than method variables, and should be treated with special care by the programmer.

A side effect of the underscore naming convention is that it nicely resolves the problem of finding reasonable variable names for setter methods:

void setName(String name)
{
name_ = name;
}

An issue is whether the underscore should be added as a prefix or as a suffix. Both practices are commonly used, but the latter is recommended because it seems to best preserve the readability of the name.

It should be noted that scope identification in variables has been a controversial issue for quite some time. It seems, though, that this practice is now gaining acceptance and that it is becoming more and more common as a convention in the professional development community.

Generic variables should have the same name as their type.
void setTopic(Topic topic) // NOT: void setTopic(Topic value) 
                                        // NOT: void setTopic(Topic aTopic) 
                                         // NOT: void setTopic(Topic t) 
void connect(Database database) // NOT: void connect(Database db) 
                                                   // NOT: void connect(Database oracleDB)
Reduce complexity by reducing the number of terms and names used. Also makes it easy to deduce the type given a variable name only.

If for some reason this convention doesn't seem to fit it is a strong indication that the type name is badly chosen.

Non-generic variables have a role. These variables can often be named by combining role and type:

Point startingPoint, centerPoint;
Name loginName;
All names should be written in English.
English is the preferred language for international development.

Variables with a large scope should have long names, variables with a small scope can have short names
Scratch variables used for temporary storage or indices are best kept short. A programmer reading such a variable should be able to assume that its value is not used outside a few lines of code. Common scratch variables for integers are i, j, k, m, n and for characters c and d.

The name of the object is implicit, and should be avoided in a method name.
line.getLength(); // NOT: line.getLineLength();
The latter might seem natural in the class declaration, but proves superfluous in use, as shown in the example.


The terms get/set must be used where an attribute is accessed directly.
employee.getName(); 
employee.setName(name); 
matrix.getElement(2, 4);
matrix.setElement(2, 4, value);

is prefix should be used for boolean variables and methods.
isSet, isVisible, isFinished, isFound, isOpen
This is the naming convention for boolean methods and variables used by Sun for the Java core packages.

Using the is prefix solves a common problem of choosing bad boolean names like status or flag. isStatus or isFlag simply doesn't fit, and the programmer is forced to choose more meaningful names.

Setter methods for boolean variables must have set prefix as in:

void setFound(boolean isFound);

There are a few alternatives to the is prefix that fit better in some situations. These are the has, can and should prefixes:

boolean hasLicense();
boolean canEvaluate();
boolean shouldAbort = false;


The term compute can be used in methods where something is computed.
valueSet.computeAverage(); matrix.computeInverse()
Gives the reader the immediate clue that this is a potentially time-consuming operation, and if used repeatedly, he might consider caching the result. Consistent use of the term enhances readability.


 Iterator variables should be called i, j, k etc.
for (Iterator i = points.iterator(); i.hasNext(); ) { : }
for (int i = 0; i < nTables; i++) { : }

The notation is taken from mathematics where it is an established convention for indicating iterators. Variables named j, k etc. should be used for nested loops only. 

Complement names must be used for complement entities 
get/set, add/remove, create/destroy, start/stop, insert/delete, increment/decrement, old/new, begin/end, first/last, up/down, min/max, next/previous, open/close, show/hide, suspend/resume, etc.
Reduce complexity by symmetry. 

Abbreviations in names should be avoided
computeAverage(); // NOT: compAvg(); 
ActionEvent event; // NOT: ActionEvent e; 
catch (Exception exception) { // NOT: catch (Exception e) { 
There are two types of words to consider. First are the common words listed in a language dictionary. These must never be abbreviated.
Never write:
cmd instead of command
comp instead of compute
cp instead of copy
e instead of exception
init instead of initialize
pt instead of point
etc.
Then there are domain-specific phrases that are more naturally known through their acronyms or abbreviations. These phrases should be kept abbreviated. Never write:
HypertextMarkupLanguage instead of html
CentralProcessingUnit instead of cpu
PriceEarningRatio instead of pe
etc.

Negated boolean variable names must be avoided. 
boolean isError; // NOT: isNoError
boolean isFound; // NOT: isNotFound
The problem arises when the logical not operator is used and a double negative results. It is not immediately apparent what !isNotError means.

Associated constants (final variables) should be prefixed by a common type name.

final int COLOR_RED = 1; 
final int COLOR_GREEN = 2; 
final int COLOR_BLUE = 3; 
This indicates that the constants belong together, and what concept the constants represent.
An alternative to this approach is to put the constants inside an interface, effectively prefixing their names with the name of the interface:
interface Color {
  final int RED = 1;
  final int GREEN = 2;
  final int BLUE = 3;
}


Exception classes should be suffixed with Exception. 

class AccessException extends Exception { : } 
Exception classes are really not part of the main design of the program, and naming them like this makes them stand out relative to the other classes. This standard is followed by Sun in the basic Java library.


Default interface implementations can be prefixed by Default. 

class DefaultTableCellRenderer implements TableCellRenderer { : } 
It is not uncommon to create a simplistic class implementation of an interface providing default behaviour to the interface methods. The convention of prefixing these classes by Default has been adopted by Sun for the Java library.


Singleton classes should return their sole instance through method getInstance

class UnitManager {
  private final static UnitManager instance_ = new UnitManager();

  private UnitManager() {
    ...
  }

  public static UnitManager getInstance() // NOT: get() or instance() or unitManager() etc.
  {
    return instance_;
  }
}
Common practice in the Java community though not consistently followed by Sun in the JDK. The above layout is the preferred pattern.

Classes that create instances on behalf of others (factories) can do so through method new[ClassName]

class PointFactory { public Point newPoint(...) { ... } } 
Indicates that the instance is created by new inside the factory method and that the construct is a controlled replacement of new Point().
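A minimal sketch of such a factory is shown below. The Point fields and the creation counter are assumptions added for illustration; the counter merely shows one kind of control a factory can add around new.

```java
class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }
}

class PointFactory {
    // Hypothetical bookkeeping: counts how many points this factory has created.
    private int createdCount;

    // Named new[ClassName] to signal that a fresh instance is created on each call.
    Point newPoint(int x, int y) {
        createdCount++;
        return new Point(x, y);
    }

    int getCreatedCount() {
        return createdCount;
    }
}

class PointFactoryDemo {
    public static void main(String[] args) {
        PointFactory factory = new PointFactory();
        Point point = factory.newPoint(3, 4);
        System.out.println(point.getX() + "," + point.getY());   // 3,4
        System.out.println(factory.getCreatedCount());            // 1
    }
}
```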

Functions (methods returning an object) should be named after what they return and procedures (void methods) after what they do.
Increases readability. Makes it clear what the unit should do and especially all the things it is not supposed to do. This again makes it easier to keep the code clean of side effects.

Files

Classes should be declared in individual files with the file name matching the class name. 

Secondary private classes can be declared as inner classes and reside in the file of the class they belong to. Enforced by the Java tools.

 File content must be kept within 80 columns. 80 columns is the common dimension for editors, terminal emulators, printers and debuggers, and files that are shared between several developers should keep within these constraints. It improves readability when unintentional line breaks are avoided when passing a file between programmers.

Special characters like TAB and page break must be avoided. 

These characters are bound to cause problems for editors, printers, terminal emulators or debuggers when used in a multi-programmer, multi-platform environment.
  

The incompleteness of split lines must be made obvious 
totalSum = a + b + c + 
                   d + e; 
method(param1, param2, 
                  param3);
setText ("Long line split" +
                "into two parts."); 
 for (int tableNo = 0; tableNo < nTables; 
           tableNo += tableStep) { ... } 

Split lines occur when a statement exceeds the 80 column limit given above. It is difficult to give rigid rules for how lines should be split, but the examples above should give a general hint. In general:
Break after a comma.
Break after an operator.
Align the new line with the beginning of the expression on the previous line.

Statements


Type conversions must always be done explicitly. Never rely on implicit type conversion.
floatValue = (float) intValue; // NOT: floatValue = intValue;
By this, the programmer indicates that he is aware of the different types involved and that the mix is intentional.
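A short sketch of why the explicit form is worth the extra typing. The class ConversionDemo and its helper methods are hypothetical.

```java
class ConversionDemo {
    // The cast makes the floating-point division intentional and visible.
    static double half(int value) {
        return (double) value / 2;
    }

    // A narrowing cast truncates toward zero; it does not round.
    static int truncate(double value) {
        return (int) value;
    }

    public static void main(String[] args) {
        int intValue = 7;
        float floatValue = (float) intValue;     // widening, written out explicitly
        System.out.println(floatValue);          // 7.0
        System.out.println(intValue / 2);        // 3 (integer division)
        System.out.println(half(intValue));      // 3.5
        System.out.println(truncate(3.99));      // 3
        System.out.println(truncate(-3.99));     // -3
    }
}
```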
 

Variables should be initialized where they are declared and they should be declared in the smallest scope possible.
This ensures that variables are valid at any time. Sometimes it is impossible to initialize a variable to a valid value where it is declared. In these cases it should be left uninitialized rather than initialized to some phony value.

Improving coding style into classes

Class and Interface declarations should be organized in the following manner: 
1. Class/Interface documentation. 
2. class or interface statement. 
3. Class (static) variables in the order public, protected, package (no access modifier), private. 
4. Instance variables in the order public, protected, package (no access modifier), private.
5. Constructors. 
6. Methods (no specific order).
Reduce complexity by making the location of each class element predictable.


Imported classes should always be listed explicitly.
import java.util.List; // NOT: import java.util.*; 
import java.util.ArrayList; 
import java.util.HashSet; 
Importing classes explicitly gives an excellent documentation value for the class at hand and makes the class easier to comprehend and maintain. Appropriate tools should be used in order to always keep the import list minimal and up to date.

Improving coding style into functions or methods

Method modifiers should be given in the following order: [access] static abstract synchronized [unusual] final native
The [access] modifier (if present) must be the first modifier.
public static double square(double a);
// NOT: static public double square(double a);
[access] is one of public, protected or private, while [unusual] includes volatile and transient. The most important lesson here is to keep the access modifier as the first modifier. Of the possible modifiers, this is by far the most important, and it must stand out in the method declaration. For the other modifiers, the order is less important, but it makes sense to have a fixed convention.
 

Specific cases of naming enhancing naming style

The term find can be used in methods where something is looked up.
vertex.findNearestVertex();
matrix.findSmallestElement();
node.findShortestPath(Node destinationNode);
Gives the reader the immediate clue that this is a simple lookup method with a minimum of computations involved. Consistent use of the term enhances readability.

The term initialize can be used where an object or a concept is established.
printer.initializeFontSet();
The American initialize should be preferred over the English initialise. The abbreviation init must be avoided.




Plural form should be used on names representing a collection of objects.
Collection points; int[] values;
Enhances readability since the name gives the user an immediate clue of the type of the variable and the operations that can be performed on its elements.
 

n prefix should be used for variables representing a number of objects.
nPoints, nLines
The notation is taken from mathematics where it is an established convention for indicating a number of objects.

Note that Sun uses the num prefix in the core Java packages for such variables. This is probably meant as an abbreviation of number of, but as it looks more like number it makes the variable name strange and misleading. If "number of" is the preferred phrase, the numberOf prefix can be used instead of just n. The num prefix must not be used.

The suffix No should be used for variables representing an entity number.
tableNo, employeeNo
The notation is taken from mathematics where it is an established convention for indicating an entity number.

An elegant alternative is to prefix such variables with an i: iTable, iEmployee. This effectively makes them named iterators.

Java specific naming convention

JFC (Java Swing) variables should be suffixed by the element type.
widthScale, nameTextField, leftScrollbar, mainPanel, fileToggle, minLabel, printerDialog
Enhances readability since the name gives the user an immediate clue of the type of the variable and thereby the available resources of the object.

Array specifiers must be attached to the type not the variable.
int[] a = new int[20]; // NOT: int a[] = new int[20]
The arrayness is a feature of the base type, not the variable. It is not known why Sun allows both forms.

Java source files should have the extension .java.
Point.java
Enforced by the Java tools.


The import statements must follow the package statement. import statements should be sorted with the most fundamental packages first, and grouped with associated packages together and one blank line between groups.
import java.io.IOException;
import java.net.URL;

import java.rmi.RmiServer;
import java.rmi.server.Server;

import javax.swing.JPanel;
import javax.swing.event.ActionEvent;

import org.linux.apache.server.SoapServer;
The import statement location is enforced by the Java language. The sorting makes it simple to browse the list when there are many imports, and it makes it easy to determine the dependencies of the present package. The grouping reduces complexity by collapsing related information into a common unit.


The package statement must be the first statement of the file.
All files should belong to a specific package. The package statement location is enforced by the Java language. Letting all files belong to an actual (rather than the Java default) package enforces Java language object oriented programming techniques. 

Thursday, September 16, 2010

Java and CPP - the differences and similarities

This list of similarities and differences is based heavily on The Java Language Environment, A White Paper by James Gosling and Henry McGilton http://java.sun.com/doc/language_environment/ and the soon-to-be published book, Thinking in Java by Bruce Eckel, http://www.EckelObjects.com/. At least these were the correct URLs at one point in time. Be aware, however, that the web is a dynamic environment and the URLs may change in the future.
Java does not support typedefs, defines, or a preprocessor. Without a preprocessor, there are no provisions for including header files.
Since Java does not have a preprocessor there is no concept of #define macros or manifest constants. However, the declaration of named constants is supported in Java through use of the final keyword.
Before Java 5, Java did not support enums; it only supported named constants, as mentioned above. Java 5 added type-safe enums to the language.
Java supports classes, but does not support structures or unions.
All stand-alone C++ programs require a function named main and can have numerous other functions, including both stand-alone functions and functions, which are members of a class. There are no stand-alone functions in Java. Instead, there are only functions that are members of a class, usually called methods. Global functions and global data are not allowed in Java.
All classes in Java ultimately inherit from the Object class. This is significantly different from C++ where it is possible to create inheritance trees that are completely unrelated to one another.
All function or method definitions in Java are contained within the class definition. To a C++ programmer, they may look like inline function definitions, but they aren't. Java doesn't allow the programmer to request that a function be made inline, at least not directly.
Both C++ and Java support class (static) methods or functions that can be called without the requirement to instantiate an object of the class.
The interface keyword in Java is used to create the equivalent of an abstract base class containing only method declarations and constants. No variable data members or method definitions are allowed. (True abstract base classes can also be created in Java.) The interface concept is not supported by C++.
Java does not support multiple inheritance. To some extent, the interface feature provides the desirable features of multiple inheritance to a Java program without some of the underlying problems.
While Java does not support multiple inheritance, single inheritance in Java is similar to C++, but the manner in which you implement inheritance differs significantly, especially with respect to the use of constructors in the inheritance chain.
In addition to the access specifiers applied to individual members of a class, C++ allows you to provide an additional access specifier when inheriting from a class. This latter concept is not supported by Java.
Java does not support the goto statement (but goto is a reserved word). However, it does support labeled break and continue statements, a feature not supported by C++. In certain restricted situations, labeled break and continue statements can be used where a goto statement might otherwise be used.
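As a rough illustration of a labeled break, the sketch below leaves two nested loops at once, a situation where C or C++ code might reach for goto. The class and method names are hypothetical.

```java
class LabeledBreakDemo {
    // Returns {row, column} of the first occurrence of target, or null if absent.
    static int[] find(int[][] grid, int target) {
        int[] position = null;
        search:
        for (int row = 0; row < grid.length; row++) {
            for (int column = 0; column < grid[row].length; column++) {
                if (grid[row][column] == target) {
                    position = new int[] { row, column };
                    break search;   // exits both loops, not just the inner one
                }
            }
        }
        return position;
    }

    public static void main(String[] args) {
        int[][] grid = { { 1, 2 }, { 3, 4 } };
        int[] position = find(grid, 3);
        System.out.println(position[0] + "," + position[1]);   // 1,0
        System.out.println(find(grid, 9) == null);             // true
    }
}
```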
Java does not support operator overloading.
Java does not support automatic type conversions (except where guaranteed safe).
Unlike C++, Java has a String type, and objects of this type are immutable (cannot be modified). Quoted strings are automatically converted into String objects. Java also has a StringBuffer type. Objects of this type can be modified, and a variety of string manipulation methods are provided.
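The difference between the immutable String and the modifiable StringBuffer can be sketched like this. StringDemo and its join method are hypothetical names.

```java
class StringDemo {
    // Builds the result by appending to one mutable StringBuffer.
    static String join(String... parts) {
        StringBuffer buffer = new StringBuffer();
        for (String part : parts) {
            buffer.append(part);   // modifies the buffer in place
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String greeting = "hello";
        String shout = greeting.toUpperCase();     // returns a new String object
        System.out.println(greeting);              // hello (the original is unchanged)
        System.out.println(shout);                 // HELLO
        System.out.println(join("a", "b", "c"));   // abc
    }
}
```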
Unlike C++, Java provides true arrays as first-class objects. There is a length member, which tells you how big the array is. An exception is thrown if you attempt to access an array out of bounds. All arrays are instantiated in dynamic memory and assignment of one array to another is allowed. However, when you make such an assignment, you simply have two references to the same array. Changing the value of an element in the array using one of the references changes the value insofar as both references are concerned.
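These array properties (the length member, bounds checking, and reference assignment) can be seen in a few lines. The class ArrayDemo and the helper isOutOfBounds are hypothetical.

```java
class ArrayDemo {
    // Returns true if reading values[index] is rejected with an exception.
    static boolean isOutOfBounds(int[] values, int index) {
        try {
            int ignored = values[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException exception) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] original = { 1, 2, 3 };
        int[] alias = original;        // copies the reference, not the elements

        alias[0] = 99;                 // visible through both references
        System.out.println(original[0]);                   // 99
        System.out.println(original.length);               // 3
        System.out.println(isOutOfBounds(original, 3));    // true
    }
}
```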
Unlike C++, having two "pointers" or references to the same object in dynamic memory is not necessarily a problem (but it can lead to somewhat confusing results). In Java, dynamic memory is reclaimed automatically, but is not reclaimed until all references to that memory become null or cease to exist. Therefore, unlike in C++, the allocated dynamic memory cannot become invalid for as long as it is being referenced by any reference variable.
Java does not support pointers (at least it does not allow you to modify the address contained in a pointer or to perform pointer arithmetic). Much of the need for pointers was eliminated by providing types for arrays and strings. For example, the oft-used C++ declaration char* ptr needed to point to the first character in a C++ null-terminated "string" is not required in Java, because a string is a true object in Java.
A class definition in Java looks similar to a class definition in C++, but there is no closing semicolon. Also forward reference declarations that are sometimes required in C++ are not required in Java.
The scope resolution operator (::) required in C++ is not used in Java. The dot is used to construct all fully-qualified references. Also, since there are no pointers, the pointer operator (->) used in C++ is not required in Java.
In C++, static data members and functions are called using the name of the class and the name of the static member connected by the scope resolution operator. In Java, the dot is used for this purpose.
Like C++, Java has primitive types such as int, float, etc. Unlike C++, the size of each primitive type is the same regardless of the platform. There is no unsigned integer type in Java. Type checking and type requirements are much tighter in Java than in C++.
Unlike C++, Java provides a true boolean type.
Conditional expressions in Java must evaluate to boolean rather than to integer, as is the case in C++. Statements such as if(x+y)... are not allowed in Java because the conditional expression doesn't evaluate to a boolean.
The char type in C++ is an 8-bit type that maps to the ASCII (or extended ASCII) character set. The char type in Java is a 16-bit type and uses the Unicode character set (the Unicode values from 0 through 127 match the ASCII character set). For information on the Unicode character set see http://www.stonehand.com/unicode.html.
Unlike C++, the >> operator in Java is a "signed" right bit shift, inserting the sign bit into the vacated bit position. Java adds the >>> operator, which inserts zeros into the vacated bit positions.
C++ allows the instantiation of variables or objects of all types either at compile time in static memory or at run time using dynamic memory. However, Java requires all variables of primitive types to be instantiated at compile time, and requires all objects to be instantiated in dynamic memory at runtime. Wrapper classes are provided for all primitive types (Byte and Short were added in Java 1.1) to allow them to be instantiated as objects in dynamic memory at runtime if needed.
C++ requires that classes and functions be declared before they are used. This is not necessary in Java.
The "namespace" issues prevalent in C++ are handled in Java by including everything in a class, and collecting classes into packages.
C++ requires that you re-declare static data members outside the class. This is not required in Java.
In C++, unless you specifically initialize variables of primitive types, they will contain garbage. Although local variables of primitive types can be initialized in the declaration, before C++11 primitive data members of a class could not be initialized in the class definition (static const integral members being the exception).
In Java, you can initialize primitive data members in the class definition. You can also initialize them in the constructor. If you fail to initialize them, they will be initialized to zero (or equivalent) automatically.
Like C++, Java supports constructors that may be overloaded. As in C++, if you fail to provide a constructor, a default constructor will be provided for you. If you provide a constructor, the default constructor is not provided automatically.
All objects in Java are passed by reference, eliminating the need for the copy constructor used in C++.
(In reality, all parameters are passed by value in Java.  However, passing a copy of a reference variable makes it possible for code in the receiving method to access the object referred to by the variable, and possibly to modify the contents of that object.  However, code in the receiving method cannot cause the original reference variable to refer to a different object.)
There are no destructors in Java. Unused memory is returned to the operating system by way of a garbage collector, which runs in a different thread from the main program. This leads to a whole host of subtle and extremely important differences between Java and C++.
Like C++, Java allows you to overload functions. However, default arguments are not supported by Java.
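Unlike C++, Java does not support templates. Before Java 5 there were no generic functions or classes; Java 5 added generics, which cover many of the same uses but work differently from templates.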
Unlike C++, several "data structure" classes are contained in the "standard" version of Java. More specifically, they are contained in the standard class library that is distributed with the Java Development Kit (JDK). For example, the standard version of Java provides the containers Vector and Hashtable that can be used to contain any object through recognition that any object is an object of type Object. However, to use these containers, you must perform the appropriate upcasting and downcasting, which may lead to efficiency problems.
Multithreading is a standard feature of the Java language.
Although Java uses the same keywords as C++ for access control: private, public, and protected, the interpretation of these keywords is significantly different between Java and C++.
There is no virtual keyword in Java. All non-static methods always use dynamic binding, so the virtual keyword isn't needed for the same purpose that it is used in C++.
Java provides the final keyword that can be used to specify that a method cannot be overridden and that it can be statically bound. (The compiler may elect to make it inline in this case.)
The detailed implementation of the exception handling system in Java is significantly different from that in C++.
Unlike C++, Java does not support operator overloading. However, the (+) and (+=) operators are automatically overloaded to concatenate strings, and to convert other types to string in the process.
As in C++, Java applications can call functions written in another language. This is commonly referred to as native methods. However, applets cannot call native methods.
Unlike C++, Java has built-in support for program documentation. Specially written comments can be automatically stripped out using a separate program named javadoc to produce program documentation.
Generally Java is more robust than C++ due to the following:
  • Object handles (references) are automatically initialized to null.
  • Handles are checked before accessing, and exceptions are thrown in the event of problems.
  • You cannot access an array out of bounds.
  • Memory leaks are prevented by automatic garbage collection.
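Several of the C++ behaviours named in the list above can be seen in a few lines. The following is only a sketch of the C++ side: Counter and describeSum are hypothetical names made up for the illustration. It shows a static member re-declared outside the class and reached with the scope resolution operator, a default argument, and an integer used directly as a condition.

```cpp
#include <string>

// Hypothetical class used only to illustrate the C++ side of the list above.
class Counter {
public:
    static int count;                 // declared inside the class ...
    static void bump(int by = 1) {    // default argument (Java has no equivalent)
        count += by;
    }
};
int Counter::count = 0;               // ... and defined again outside it

// In C++ any integer can serve as a condition; nonzero means true.
std::string describeSum(int x, int y) {
    if (x + y) {                      // Java rejects this: not a boolean
        return "nonzero";
    }
    return "zero";
}
```

Calling Counter::bump() and then Counter::bump(5) leaves Counter::count at 6. In Java the same calls would be written with a dot, and the second form would need a separate overload.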

Wednesday, September 1, 2010

Types of constructors

1. Void constructors or default constructors
This has no parameters and is a must when arrays of objects are allocated dynamically (new T[n] default-constructs each element).


2. Default parameter constructor
A default parameter is a function parameter that has a default value provided to it. If the user does not supply a value for this parameter, the default value will be used. If the user does supply a value for the default parameter, the user-supplied value is used.

3. Private constructors

4. Parametric constructor
It is good practice to try not to overload the constructors. It is best to declare only one constructor and give it default parameters wherever possible:



#include <iostream>
using namespace std;

class vector {
public:
    double x;
    double y;

    // One constructor with default parameters covers all three calls below.
    vector (double a = 0, double b = 0) {
        x = a;
        y = b;
    }
};

int main () {
    vector k;            // both defaults used
    cout << "vector k: " << k.x << ", " << k.y << endl << endl;

    vector m (45, 2);    // both values supplied
    cout << "vector m: " << m.x << ", " << m.y << endl << endl;

    vector p (3);        // b falls back to its default
    cout << "vector p: " << p.x << ", " << p.y << endl << endl;

    return 0;
}

output:
vector k: 0, 0
vector m: 45, 2
vector p: 3, 0
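Item 3 in the list above names private constructors without showing one. A common use, sketched here under the assumption that the goal is to force callers through named factory functions, is to make the constructor private and expose static functions instead. Temperature and its members are hypothetical names, not part of the original post.

```cpp
// Hypothetical Temperature class: the constructor is private, so objects
// can only be created through the named static factory functions.
class Temperature {
public:
    static Temperature fromCelsius(double c) {
        return Temperature(c);
    }
    static Temperature fromFahrenheit(double f) {
        return Temperature((f - 32.0) * 5.0 / 9.0);
    }
    double celsius() const { return celsius_; }

private:
    explicit Temperature(double c) : celsius_(c) {}   // not callable from outside
    double celsius_;
};
```

Writing Temperature t(20.0); outside the class does not compile, while Temperature::fromCelsius(20.0) does, so the unit is always stated at the call site.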

The stack and the heap

The memory a program uses is typically divided into four different areas:
  • The code area, where the compiled program sits in memory.
  • The globals area, where global variables are stored.
  • The heap, where dynamically allocated variables are allocated from.
  • The stack, where parameters and local variables are allocated from.
There isn’t really much to say about the first two areas. The heap and the stack are where most of the interesting stuff takes place, and those are the two that will be the focus of this section.
The heap
The heap (also known as the “free store”) is a large pool of memory used for dynamic allocation. In C++, when you use the new operator to allocate memory, this memory is assigned from the heap.
int *pValue = new int; // pValue is assigned 4 bytes from the heap
int *pArray = new int[10]; // pArray is assigned 40 bytes from the heap
Because the precise location of the memory allocated is not known in advance, the memory allocated has to be accessed indirectly — which is why new returns a pointer. You do not have to worry about the mechanics behind the process of how free memory is located and allocated to the user. However, it is worth knowing that sequential memory requests may not result in sequential memory addresses being allocated!
int *pValue1 = new int;
int *pValue2 = new int;
// pValue1 and pValue2 may not have sequential addresses
When a dynamically allocated variable is deleted, the memory is “returned” to the heap and can then be reassigned as future allocation requests are received.
The heap has advantages and disadvantages:
1) Allocated memory stays allocated until it is specifically deallocated (beware memory leaks).
2) Dynamically allocated memory must be accessed through a pointer.
3) Because the heap is a big pool of memory, large arrays, structures, or classes should be allocated here.
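The three points above can be seen together in one small function. This is a sketch rather than a recommended style, and sumOfSquares is a hypothetical name: the array is allocated from the heap, reached only through a pointer, and stays allocated until delete[] is called.

```cpp
#include <cstddef>

// Allocates n ints on the heap, fills them with squares, and returns their
// sum. The array is released with delete[] before the function returns.
long sumOfSquares(std::size_t n) {
    int* values = new int[n];                  // memory comes from the heap
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(i * i);   // accessed through the pointer
    }
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    delete[] values;                           // without this line the block leaks
    return total;
}
```

For n = 4 the function sums 0, 1, 4 and 9 and returns 14.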
The stack
The call stack (usually referred to as “the stack”) has a much more interesting role to play. Before we talk about the call stack, which refers to a particular portion of memory, let’s talk about what a stack is.
Consider a stack of plates in a cafeteria. Because each plate is heavy and they are stacked, you can really only do one of three things:
1) Look at the surface of the top plate
2) Take the top plate off the stack
3) Put a new plate on top of the stack
In computer programming, a stack is a container that holds other variables (much like an array). However, whereas an array lets you access and modify elements in any order you wish, a stack is more limited. The operations that can be performed on a stack are identical to the ones above:
1) Look at the top item on the stack (usually done via a function called top())
2) Take the top item off of the stack (done via a function called pop())
3) Put a new item on top of the stack (done via a function called push())
A stack is a last-in, first-out (LIFO) structure. The last item pushed onto the stack will be the first item popped off. If you put a new plate on top of the stack, anybody who takes a plate from the stack will take the plate you just pushed on first. Last on, first off. As items are pushed onto a stack, the stack grows larger — as items are popped off, the stack grows smaller.
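The standard library's std::stack offers exactly these three operations. The sketch below (popOrder is a hypothetical helper name) pushes some values and then pops them all, returning the order in which they come off.

```cpp
#include <stack>
#include <vector>

// Pushes the given values in order, then pops them all and returns the
// order in which they came off the stack.
std::vector<int> popOrder(const std::vector<int>& values) {
    std::stack<int> plates;
    for (int v : values) {
        plates.push(v);                  // put a new item on top
    }
    std::vector<int> order;
    while (!plates.empty()) {
        order.push_back(plates.top());   // look at the top item
        plates.pop();                    // take the top item off
    }
    return order;
}
```

Pushing 1, 2, 3 gives back 3, 2, 1: last in, first out.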
The plate analogy is a pretty good analogy as to how the call stack works, but we can actually make an even better analogy. Consider a bunch of mailboxes, all stacked on top of each other. Each mailbox can only hold one item, and all mailboxes start out empty. Furthermore, each mailbox is nailed to the mailbox below it, so the number of mailboxes can not be changed. If we can’t change the number of mailboxes, how do we get a stack-like behavior?
First, we use a marker (like a post-it note) to keep track of where the bottom-most empty mailbox is. In the beginning, this will be the lowest mailbox. When we push an item onto our mailbox stack, we put it in the mailbox that is marked (which is the first empty mailbox), and move the marker up one mailbox. When we pop an item off the stack, we move the marker down one mailbox and remove the item from that mailbox. Anything below the marker is considered “on the stack”. Anything at the marker or above the marker is not on the stack.
This is almost exactly analogous to how the call stack works. The call stack is a fixed-size chunk of sequential memory addresses. The mailboxes are memory addresses, and the “items” are pieces of data (typically either variables or addresses). The “marker” is a register (a small piece of memory) in the CPU known as the stack pointer. The stack pointer keeps track of where the top of the stack currently is.
The only difference between our hypothetical mailbox stack and the call stack is that when we pop an item off the call stack, we don’t have to erase the memory (the equivalent of emptying the mailbox). We can just leave it to be overwritten by the next item pushed to that piece of memory. Because the stack pointer will be below that memory location, we know that memory location is not on the stack.
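The mailbox picture translates almost directly into code. The following is a minimal sketch of the analogy, not of how a real CPU stack is implemented: MailboxStack, CAPACITY and marker_ are hypothetical names, and the capacity of 8 is arbitrary.

```cpp
#include <cstddef>

// Fixed-size "mailbox" stack: the storage never grows or shrinks, only the
// marker moves.
class MailboxStack {
public:
    static const std::size_t CAPACITY = 8;

    MailboxStack() : marker_(0) {}

    // Put the item in the marked mailbox, then move the marker up.
    bool push(int item) {
        if (marker_ == CAPACITY) return false;   // no room left
        boxes_[marker_++] = item;
        return true;
    }

    // Move the marker down and hand back the item. The mailbox itself is
    // not erased; the next push simply overwrites it.
    bool pop(int& item) {
        if (marker_ == 0) return false;          // nothing on the stack
        item = boxes_[--marker_];
        return true;
    }

    std::size_t size() const { return marker_; }

private:
    int boxes_[CAPACITY];
    std::size_t marker_;   // index of the lowest empty mailbox
};
```

Here push refuses to go past the last mailbox. The real call stack has no such check built into every push, which is why running past its end shows up as a crash rather than a return value.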
So what do we push onto our call stack? Parameters, local variables, and… function calls.
The stack in action
Because parameters and local variables essentially belong to a function, we really only need to consider what happens on the stack when we call a function. Here is the sequence of steps that takes place when a function is called:
  1. The address of the instruction beyond the function call is pushed onto the stack. This is how the CPU remembers where to go after the function returns.
  2. Room is made on the stack for the function’s return type. This is just a placeholder for now.
  3. The CPU jumps to the function’s code.
  4. The current top of the stack is held in a special pointer called the stack frame. Everything added to the stack after this point is considered “local” to the function.
  5. All function arguments are placed on the stack.
  6. The instructions inside of the function begin executing.
  7. Local variables are pushed onto the stack as they are defined.
When the function terminates, the following steps happen:
  1. The function’s return value is copied into the placeholder that was put on the stack for this purpose.
  2. Everything after the stack frame pointer is popped off. This destroys all local variables and arguments.
  3. The return value is popped off the stack and is assigned as the value of the function. If the value of the function isn’t assigned to anything, no assignment takes place, and the value is lost.
  4. The address of the next instruction to execute is popped off the stack, and the CPU resumes execution at that instruction.
Typically, it is not important to know all the details about how the call stack works. However, understanding that functions are effectively pushed on the stack when they are called and popped off when they return gives you the fundamentals needed to understand recursion, as well as some other concepts that are useful when debugging.
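Recursion is the easiest place to see this. In the sketch below, every call to factorial gets its own frame with its own copy of n; the frames pile up until n reaches 1 and are then popped in reverse order as each call returns.

```cpp
// Each call gets its own stack frame holding its own copy of n.
unsigned long factorial(unsigned int n) {
    if (n <= 1) {
        return 1;                      // deepest frame: start returning
    }
    return n * factorial(n - 1);       // pushes one more frame
}
```

factorial(5) has five frames on the stack at its deepest point and returns 120.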
Stack overflow
The stack has a limited size, and consequently can only hold a limited amount of information. If the program tries to put too much information on the stack, stack overflow will result. Stack overflow happens when all the memory in the stack has been allocated — in that case, further allocations begin overflowing into other sections of memory.
Stack overflow is generally the result of allocating too many variables on the stack, and/or making too many nested function calls (where function A calls function B calls function C calls function D etc…) Overflowing the stack generally causes the program to crash.
Here is an example program that causes a stack overflow. You can run it on your system and watch it crash:
int main()
{
    int nStack[100000000];
    return 0;
}
This program tries to allocate a huge array on the stack. Because the stack is not large enough to handle this array, the array allocation overflows into portions of memory the program is not allowed to use. Consequently, the program crashes.
The stack has advantages and disadvantages:
  • Memory allocated on the stack stays in scope as long as it is on the stack. It is destroyed when it is popped off the stack.
  • All memory allocated on the stack is known at compile time. Consequently, this memory can be accessed directly through a variable.
  • Because the stack is relatively small, it is generally not a good idea to do anything that eats up lots of stack space. This includes allocating large arrays, structures, and classes, as well as heavy recursion.
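One way to follow the last point is to let std::vector hold large arrays. This is a sketch with a hypothetical function name (fillLargeBuffer): the vector object itself is small and lives on the stack, while the elements it manages are allocated on the heap.

```cpp
#include <cstddef>
#include <vector>

// Creates count ints on the heap, each set to 1, and returns their sum.
// The storage is freed automatically when big goes out of scope.
std::size_t fillLargeBuffer(std::size_t count) {
    std::vector<int> big(count, 1);    // heap storage for count ints
    std::size_t total = 0;
    for (std::size_t i = 0; i < big.size(); ++i) {
        total += big[i];
    }
    return total;
}
```

A call such as fillLargeBuffer(4000000) needs about 16 MB with 4-byte ints, more than many default stacks provide, yet it does not touch the stack limit.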

Word Length Frequency

// word_len_histo.cpp : reads words and lists distribution
//                      of word lengths.
// Fred Swartz, 2002-09-01

// This would be nice to turn into an OO program, where
// a class represented a distribution of values.
// Some elements which are globals here would turn into
// private member elements in the class (eg, valueCount).


//--- includes
#include <iostream>
#include <iomanip>
#include <cctype>
using namespace std;

//--- prototypes
void  countValue(int cnt);
float getAverage();

//--- constants
const int BINS = 21;  // how many numbers can be counted

//--- globals
int valueCount[BINS]; // bins used for counting each number
int totalChars = 0;   // total number of characters

//=========================================================== main
int main() {

    char c;              // input character
    int  wordLen = 0;    // 0 if not in word, else word length

    //--- Initialize counts to zero
    for (int i=0; i<BINS; i++) {
        valueCount[i] = 0;
    }

    //--- Read chars in loop and decide if in a word or not.
    while (cin.get(c)) {
        if (isalpha(c)) { // letters are in words, so
            wordLen++;    // add one to the word length
        } else {
            countValue(wordLen); // end of word
            wordLen = 0;  // not in a word, set to zero
        }
    }
    countValue(wordLen);  // necessary if word ended in EOF

    //--- print the number of words of each length
    cout << "Why does this line disappear?" << endl;
    cout << "Word length    Frequency" << endl;
    for (int j=1; j<BINS; j++) {
        cout << setw(6) << right << j << "       " 
             << setw(8) << right << valueCount[j] << endl;
    }

    //--- print average length
    cout << "\nAverage word length: " << getAverage() << endl;

    return 0;
}//end main


//==================================================== countValue
void countValue(int cnt) {
    if (cnt > 0) {
        // this must be the end of a word
        if (cnt > 20) {
            cnt = 20;  // longer than 20 counts as 20
        }
        valueCount[cnt]++; // count in correct bin
    }
    totalChars += cnt;
}//end countValue


//==================================================== getAverage
float getAverage() {
    int totalCount  = 0;

    for (int i=0; i<BINS; i++) {
        totalCount  += valueCount[i];
    }
    if (totalCount > 0) {
        return (float)totalChars/totalCount;
    } else {
        return 0.0;
    }
}//end getAverage
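The comment at the top of the program suggests turning it into an OO program in which a class represents the distribution. One possible shape is sketched below. LengthDistribution, addText and frequency are hypothetical names; the counting rules (letters form words, lengths above 20 count as 20) are copied from the program above.

```cpp
#include <cctype>
#include <string>

// The globals valueCount and totalChars become private members.
class LengthDistribution {
public:
    static const int BINS = 21;        // lengths 1..19, plus 20 for "20 or more"

    LengthDistribution() : totalChars_(0) {
        for (int i = 0; i < BINS; i++) count_[i] = 0;
    }

    // Record one word length; lengths above 20 are counted as 20.
    void countValue(int len) {
        if (len <= 0) return;
        if (len > BINS - 1) len = BINS - 1;
        count_[len]++;
        totalChars_ += len;
    }

    // Split text into runs of letters and record the length of each run.
    void addText(const std::string& text) {
        int wordLen = 0;
        for (std::string::size_type i = 0; i < text.size(); i++) {
            if (std::isalpha(static_cast<unsigned char>(text[i]))) {
                wordLen++;
            } else {
                countValue(wordLen);
                wordLen = 0;
            }
        }
        countValue(wordLen);           // the text may end inside a word
    }

    int frequency(int len) const {
        return (len > 0 && len < BINS) ? count_[len] : 0;
    }

    float average() const {
        int words = 0;
        for (int i = 0; i < BINS; i++) words += count_[i];
        return words > 0 ? static_cast<float>(totalChars_) / words : 0.0f;
    }

private:
    int count_[BINS];
    int totalChars_;
};
```

After addText("Roses are red"), frequency(3) is 2, frequency(5) is 1, and average() is 11 divided by 3.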


Alternative approach
Suppose that we have a very short paragraph like this: "Roses are red. Violets are blue. This verse doesn't rhyme. And neither does this one." How can we find and save all the different word lengths and their frequencies of occurrence? For example, "red" is 3 letters long and there are a total of five words of this length (are, red, are, And, one) in the paragraph. So one of the items in our cache should be the pair (3, 5).
Just like other problems where we have to keep track of the number of occurrences, we should use a hash table. The algorithm is like this:
// requires: import java.util.HashMap;
public HashMap<Integer, Integer> countWordLengthFrequency(String[] paragraph)
{
    HashMap<Integer, Integer> frequencyTable = new HashMap<Integer, Integer>();

    for (int i = 0; i < paragraph.length; i++)
    {
      if (!frequencyTable.containsKey(paragraph[i].length()))
        frequencyTable.put(paragraph[i].length(), 1);
      else
      {
        Integer count = frequencyTable.get(paragraph[i].length()) + 1;
        frequencyTable.put(paragraph[i].length(), count);
      }
    }

    return frequencyTable;
}
Explanation: we just loop through the words in the paragraph. For each word, we check to see if its length is already in the hash table. Therefore, there are two cases. 1) If the word's length is already in the table, we increase the frequency count by one. 2) Otherwise, we hash the length into the table and set the frequency to 1 because this is the first time this length is hashed into the table.
Obviously, the time complexity is O(n) because we have to check every word in the paragraph. Moreover, in the worst case, the space complexity is O(n). That's when every word in the paragraph has a different length than the other words. If you know any better algorithm for this problem
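The same idea carries over to C++ with std::unordered_map, which is also a hash table. This is a sketch that assumes the paragraph has already been split into words with punctuation removed; the function name is reused from the Java version for comparison.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Maps each word length to the number of words that have that length.
std::unordered_map<std::size_t, int>
countWordLengthFrequency(const std::vector<std::string>& words) {
    std::unordered_map<std::size_t, int> frequencyTable;
    for (std::size_t i = 0; i < words.size(); i++) {
        // operator[] inserts a zero count the first time a length is seen,
        // so both cases of the algorithm collapse into one increment.
        frequencyTable[words[i].length()]++;
    }
    return frequencyTable;
}
```

Because operator[] creates a missing entry with value zero, the explicit containsKey test of the Java version is not needed here.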

Taking input as string 1 - " C-String to Int "

Converting C-Strings to Integer

If you want to convert a C-string (zero-terminated array of chars) of digits, you can call one of the library functions to do this (good idea), or write something like the following (good exercise).

Character codes for digits

Every character is represented by a pattern of bits. These patterns can be thought of as integers. If your system uses ASCII (or any of the newer standards), the integer value of the code for '0' is 48, '1' is 49, etc. This knowledge is commonly used when converting character digits to their equivalent values.

Example function to convert C-strings to int

One of the problems to solve immediately is what to do with errors. Let's make this a bool function that returns true if we can convert the string (eg, no illegal characters), and false otherwise. We'll pass the value back in a reference parameter.

The code

//============================================== string2int
bool string2int(const char* digit, int& result) {
   result = 0;

   //--- Convert each digit char and add into result.
   while (*digit >= '0' && *digit <='9') {
      result = (result * 10) + (*digit - '0');
      digit++;
   }

   //--- Check that there were no non-digits at end.
   if (*digit != 0) {
      return false;
   }

   return true;
}
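For comparison, here is a sketch of the library route mentioned at the start of this post, built on std::strtol. string2intLib is a hypothetical name. Unlike the hand-written loop, it accepts leading whitespace and a sign, rejects an empty string, and reports values that do not fit in an int.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Library-based variant of string2int built on std::strtol.
bool string2intLib(const char* text, int& result) {
    char* end = 0;
    errno = 0;
    long value = std::strtol(text, &end, 10);
    if (end == text) return false;        // no digits were read
    if (*end != '\0') return false;       // trailing non-digit characters
    if (errno == ERANGE) return false;    // does not fit in a long
    if (value < INT_MIN || value > INT_MAX) return false;   // does not fit in an int
    result = static_cast<int>(value);
    return true;
}
```

The end pointer plays the same role as the final check in string2int: if it does not point at the terminating zero, something other than a digit stopped the conversion.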

Dynamic Allocation Issues

Memory leaks

A program that allocates memory, but doesn't free it, is said to have a memory leak. For a small amount of data this usually isn't a big problem. However, a program that runs for a long time repeatedly allocating memory without freeing it will eventually crash, often crashing the entire system.

Dangling pointers

Dangling or stale pointers are another source of problems. These are pointers to memory that has been deallocated. There's no problem in principle with leaving old pointers lying around, as long as they're never used. It's good practice to set a pointer to NULL (nullptr in C++11 and later) immediately after a delete so there is no possibility of using it again.
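The advice about resetting a pointer after delete can be wrapped in a small helper. This is a sketch assuming a C++11 compiler (for nullptr), and releaseInt is a hypothetical name.

```cpp
// Deletes the int that p points to and resets p, so that a later
// accidental use can be caught with a simple null check.
void releaseInt(int*& p) {
    delete p;        // deleting a null pointer is harmless
    p = nullptr;     // the caller's pointer no longer dangles
}
```

Taking the pointer by reference is what lets the helper change the caller's variable. Any other copies of the same address are not reset and still dangle.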

Dynamic allocation expansion policy

There is some overhead (usually a very small percentage of the total CPU time) in dynamically allocating memory. The policy on how big each expansion should be is important to efficiency. A common policy for array expansion is to double the size each time. One could, in principle, allocate one extra element each time the array was expanded, but this would be too inefficient. Although doubling is commonly used, there is no magic in it: it might be more appropriate for a particular application to add a fixed increment.
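A minimal sketch of the doubling policy follows. IntBuffer and its members are hypothetical names, the starting capacity of 1 is arbitrary, and copying is disabled (C++11 syntax) only to keep the example short. The expansions counter is there so the effect of the policy can be observed.

```cpp
#include <cstddef>

// Minimal growable int array that doubles its capacity when full.
class IntBuffer {
public:
    IntBuffer() : data_(new int[1]), size_(0), capacity_(1), expansions_(0) {}
    ~IntBuffer() { delete[] data_; }
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;

    void append(int value) {
        if (size_ == capacity_) {
            grow(capacity_ * 2);                 // the doubling policy
        }
        data_[size_++] = value;
    }

    int at(std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }
    std::size_t expansions() const { return expansions_; }

private:
    // Allocate a bigger block, copy the old elements, free the old block.
    void grow(std::size_t newCapacity) {
        int* bigger = new int[newCapacity];
        for (std::size_t i = 0; i < size_; i++) bigger[i] = data_[i];
        delete[] data_;
        data_ = bigger;
        capacity_ = newCapacity;
        expansions_++;
    }

    int* data_;
    std::size_t size_;
    std::size_t capacity_;
    std::size_t expansions_;
};
```

Appending 1000 values triggers only 10 expansions (capacities 2, 4, ... 1024). Growing by one element at a time would need 999 expansions and copy the whole array each time.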

Using stl sort

Never write your own sort! Use the sort in the Standard Template Library (STL). The STL has sorts that are efficient and well tested.

Basic syntax for calling sort

When calling the STL sort, you need to pass two parameters: the address of the first element to sort, and the address of one past the last element to sort. The address is used for iterating across array elements. For other data structures (eg, a vector) you will have to do something a little different, but for arrays we can simply express the beginning and ending points with the array name and the addition of an integer. For example,
#include <iostream>
#include <algorithm>
using namespace std;

int main() {
    int a[7] = {23, 1, 33, -20, 6, 6, 9};
    
    sort(a, a+7);
    
    for (int i=0; i<7; i++) {
        cout << a[i] << " ";
    }
    
    return 0;
}
This prints the sorted values.
-20 1 6 6 9 23 33
Include header
The <algorithm> header must be included.
Using plus to compute the address of an array element
An array variable is the address of the first element (eg, a is the address of a[0]), and the address of any element may be computed by adding an integer to the address of the first element. In this example, a is the address of the first element, and a+7 is the address of the eighth element (ie, the address of a[7]). How can we use a+7 if last element of the array is a[6]? See below.
The element past the end
The sort() function requires the end to be indicated with the address of the element beyond the last element that is to be sorted. Even though there is no such element in the array, the address of this hypothetical element can be computed. Don't worry, sort() never tries to reference data at that position; it just uses that address as an upper limit.
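For a vector, the "something a little different" mentioned earlier is to pass the vector's own iterators: begin() stands in for the array name and end() for the one-past-the-end address. A short sketch, with sortedCopy as a hypothetical helper name:

```cpp
#include <algorithm>
#include <vector>

// Returns a sorted copy of the input.
std::vector<int> sortedCopy(std::vector<int> values) {
    std::sort(values.begin(), values.end());   // end() is one past the last element
    return values;
}
```

Passing the same seven values as in the array example yields -20 1 6 6 9 23 33.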

Sorting predefined types

There is no problem sorting any of the predefined types (eg, int, float, char, ...).

Sorting class (struct) types

If you define a new type using struct or class, sort() has no idea how to compare two values. For example, if a new Student class is defined, what would it mean to compare two elements?
Defined as class:
class Student {
  public:
    int    id;
    string first_name;
    string last_name;
    float  gpa;
};
Equivalent struct declaration:
struct Student {
    int    id;
    string first_name;
    string last_name;
    float  gpa;
};
For the following discussion, class will be used instead of struct, but they are completely equivalent except that all members of a struct default to public.

Comparison is not defined by default for class objects

There are very few operators which are defined by default for user-defined classes. Assignment is defined, but none of the comparison operators are defined. For example, if we defined two students, what would it mean to compare them?
Student betty;
Student bob;
. . .   // assign some values.
if (betty > bob) {  // ILLEGAL, ILLEGAL, ILLEGAL

sort() needs comparison and assignment

The STL sort() method needs to compare elements and assign them. It uses the less-than (<) operator to compare, but less-than isn't defined for user types. C++ will perform assignment between classes/structs, so for simple structs that don't do dynamic allocation that generally isn't a problem.

You must define less-than for sort()

Overloading less-than is fairly simple. For the Student class let's define the comparison to be for the gpa field. We could also define it to be for the id or the name or whatever just as easily.
bool operator<(const Student& a, const Student& b) {
    return a.gpa < b.gpa;
}
The keyword operator is prefixed to the operator and the two parameters are passed as const (they won't be changed) reference parameters. A bool value is returned. After this function is defined, the STL sort() function may be used.
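Putting the pieces together, a sketch of sorting students by gpa might look like this. The Student layout follows the struct shown earlier; sortByGpa is a hypothetical helper name, and a vector is used in place of a plain array.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Student {
    int         id;
    std::string first_name;
    std::string last_name;
    float       gpa;
};

// Order students by gpa, lowest first.
bool operator<(const Student& a, const Student& b) {
    return a.gpa < b.gpa;
}

// Sorts the roster in place; std::sort uses operator< by default.
void sortByGpa(std::vector<Student>& roster) {
    std::sort(roster.begin(), roster.end());
}
```

To sort by id or name instead, only the body of operator< needs to change.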

Enum in c++

The problem: representing series of values

It is very common to have a series of values that need to be represented. For example, to simulate a traffic light requires representing three values (red, yellow, and green), but there is no built-in C++ color datatype.

Use integer values to represent colors, for example red as 0, yellow as 1, and green as 2. There is nothing "green" about the value 2, and it could just as easily be represented by some other number. However, it is common to start a series at zero and continue up by ones.

The danger of magic numbers

Use of these "magic" numbers in the source code makes the code unreadable. For example,
x = 1;
What does this do, assign the number one or the color yellow to x?

Use of numbers is also very error prone: it is easy to mistakenly use the wrong one, and changing the numbers and updating all references to them is difficult.

Use names instead of numbers

A better solution is to create named constants for each of the values. By convention, these named constants are uppercase.
const int RED    = 0;
const int YELLOW = 1;
const int GREEN  = 2;
Now it's easy to distinguish between assignment of the number 1 and the color yellow.
int y;
int x;
. . .
y = 1;      // assigns the integer one
x = YELLOW; // assigns yellow (which happens to be 1).
There is still the problem that we declare x as an int although it's a color.

The enum type declaration provides a solution

C++ uses the enum statement to assign sequential integer values to names and provide a type name for declaration.
enum TrafficLightColor {RED, YELLOW, GREEN};
. . .
int y;
TrafficLightColor x;
. . .
y = 1;
x = YELLOW;
The enum declaration creates a new integer type. By convention the first letter of an enum type should be in uppercase. The list of values follows, where the first name is assigned zero, the second 1, etc.

Type checking prevents some erroneous assignments

The compiler may issue an error message or warning if you try to assign one kind of enum to a different kind. It also allows some dangerous types of assignments.
enum TrafficLightColor {RED, YELLOW, GREEN};
enum Gender {MALE, FEMALE};
TrafficLightColor x;
int  i;
. . .
x = YELLOW; // good
i = x;      // Legal, but bad style.  Assigns the integer representation.
i = (int)x; // As above, explicit casting is better style.
x = (TrafficLightColor)2; // Legal, but very dangerous. No checking.

x = FEMALE; // BAD, Compiler may give error or warning.
x = 5;      // BAD, Compiler may give error or warning.
 

Setting enum values

It's possible to control the values that are assigned to each enum constant. If a value is assigned to a constant, each successive constant without a value is assigned a value one greater than the previous.
enum Day {MON=1, TUE, WED, THU, FRI, SAT, SUN};
The value of MON is one, TUE is two, etc instead of starting at zero. Another use of specific values is to create sets. Explicitly setting the values to powers of two represents each as a separate bit. These values can then be manipulated using the bit operations (&, |, ^ and ~).
enum Day {MON=1, TUE=2, WED=4, THU=8, FRI=16, SAT=32, SUN=64};
const int WEEKDAY = MON+TUE+WED+THU+FRI;
. . .
Day today;  // This will have one of the values in it.
. . .
if ((today & WEEKDAY) != 0) . . .
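The set idea can be made a little more concrete. In this sketch the sets are plain ints built with bitwise OR (which gives the same value as the sum when every constant is a distinct bit). WEEKEND, inSet and withDay are hypothetical names added for the illustration.

```cpp
enum Day { MON = 1, TUE = 2, WED = 4, THU = 8, FRI = 16, SAT = 32, SUN = 64 };

const int WEEKDAY = MON | TUE | WED | THU | FRI;   // five low bits set
const int WEEKEND = SAT | SUN;

// True when today is one of the days in the set days.
bool inSet(Day today, int days) {
    return (today & days) != 0;
}

// Adds a day to a set and returns the new set.
int withDay(int days, Day extra) {
    return days | extra;
}
```

inSet(WED, WEEKDAY) is true and inSet(SAT, WEEKDAY) is false. The sets are kept as int because combining several days produces values that are not themselves Day constants.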

Enum I/O

I/O of enums uses their integer values, not their names. This is not what is desired normally, so extra programming is required on input and output to use the names instead of integer values. The extra work for enum I/O means that they are often not used for simple programs.
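The extra programming for output is usually a small lookup function plus an overloaded << operator. The sketch below covers output only; colorName and toText are hypothetical helper names, and the enum is the traffic light type from earlier in this post.

```cpp
#include <ostream>
#include <sstream>
#include <string>

enum TrafficLightColor { RED, YELLOW, GREEN };

// Translate an enum value into its name; unknown values get a placeholder.
const char* colorName(TrafficLightColor c) {
    switch (c) {
        case RED:    return "RED";
        case YELLOW: return "YELLOW";
        case GREEN:  return "GREEN";
    }
    return "UNKNOWN";
}

// With this overload, streaming a color prints its name, not its integer.
std::ostream& operator<<(std::ostream& out, TrafficLightColor c) {
    return out << colorName(c);
}

// Helper that shows the overload in action without touching cout.
std::string toText(TrafficLightColor c) {
    std::ostringstream buffer;
    buffer << c;
    return buffer.str();
}
```

With the overload in scope, cout << YELLOW prints YELLOW rather than 1. Input needs the reverse mapping, from a name read as a string back to the enum value.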

Other languages

Java has had type-safe enums since version 1.5 (Java 5). Before that, programmers had to explicitly declare each name as an int constant. C# provides enums with additional facilities, eg to get names and check values.

Syntax:
enum name {name-list} var-list;
The enum keyword is used to create an enumerated type named name that consists of the elements in name-list. The var-list argument is optional, and can be used to create instances of the type along with the declaration. For example, the following code creates an enumerated type for colors:
enum ColorT {red, orange, yellow, green, blue, indigo, violet};
     ...
     ColorT c1 = indigo;
     if( c1 == indigo ) {
       cout << "c1 is indigo" << endl;
     }
In the above example, the effect of the enumeration is to introduce several new constants named red, orange, yellow, etc. By default, these constants are assigned consecutive integer values starting at zero. You can change the values of those constants, as shown by the next example:
enum ColorT { red = 10, blue = 15, green };
     ...
     ColorT c = green;
     cout << "c is " << c << endl;
When executed, the above code will display the following output:
c is 16
Note that the above examples will only work with C++ compilers. If you're working in regular C, you will need to specify the enum keyword whenever you create an instance of an enumerated type:
enum ColorT { red = 10, blue = 15, green };
     ...
     enum ColorT c = green;   /* note the additional enum keyword */
     printf( "c is %d\n", c );
Alternatively, add a typedef to bring C and C++ on par:
typedef enum ColorT { red = 10, blue = 15, green } ColorT;
     ...
     ColorT c = green;   /* no more additional enum keyword */
     printf( "c is %d\n", c );

 

Beginning with vi

Unsetting with vi
E.g. to turn off autoindent (ai), use
:set noai