Friday, April 28, 2006

Code Generation, Boilerplates and OO

Of late, I have been playing around with DAOs and have been blogging extensively on their implementation aspects (e.g. see here and here). Among the leading vendors offering DAO implementations, CodeFutures' FireStorm/DAO provides automatic code generation of DAOs from the relational schema. Andy Grove, the CTO of CodeFutures, claims in his weblog that the code which FireStorm generates is as efficient and maintainable as hand-written code. He also has a separate blog entry on the debate of frameworks versus code generators, where he cites the complexity of maintaining the framework as one of the driving reasons for taking the code generator approach. As an example he mentions that
the real benefit of Code Automation is that if the underlying framework changes substantially (for example going from EJB 1.1 to EJB 2.0) then the code can be re-generated for the new framework.


I have never been a fan of automatic code generators, which tend to produce tons of boilerplate that could otherwise be refactored much more elegantly using OO frameworks. Here is what Rod Johnson has to say about automatic generation of DAOs:
There shouldn't be vast amounts of boilerplate code in DAOs. ... Using a JDBC abstraction layer is better than generating JDBC code from a maintainability perspective. If we see a requirement for boring repetitive code, we should apply an OO solution and abstract it into a framework, rather than generate it and live with the resulting duplication.


In one of my earlier posts, I expressed the same concern regarding the level of abstraction that FireStorm-generated DAOs provide - I still feel that the code could have been engineered better had we employed an OO framework based solution. At Anshinsoft we have been working on generic DAOs, adopting a mixed strategy for DAO code generation. All repetitive code is part of an OO framework, which offers base-level abstractions for DAO operations. The generated code contains only the specifics of the particular tables, as specializations of the base abstractions.

After going through Andy's weblogs, I recently downloaded an evaluation copy of FireStorm/DAO and started playing with it. I generated some DAO code against a schema and came up with the following implementation of AuthorDao, which I replicate below. The generated DAO implementation class consists of 586 lines of code for a table with 3 columns! Multiply this by the number of tables in a typical enterprise application, and all the jars begin to explode.



/*
 * This source file was generated by FireStorm/DAO 3.0.1
 * on 07-Apr-2006 at 10:13:32
 *
 * If you purchase a full license for FireStorm/DAO you can customize this file header.
 *
 * For more information please visit http://www.codefutures.com/products/firestorm
 */

package com.mycompany.myapp.jdbc;

import com.mycompany.myapp.dao.*;
import com.mycompany.myapp.factory.*;
import com.mycompany.myapp.dto.*;
import com.mycompany.myapp.exceptions.*;
import java.sql.Connection;
import java.sql.Types;
import java.util.Collection;
import org.apache.log4j.Logger;
import java.sql.PreparedStatement;
import java.sql.Statement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Time;
import java.util.List;
import java.util.Iterator;
import java.util.ArrayList;

public class AuthorDaoImpl extends AbstractDataAccessObject implements AuthorDao
{
  /**
   * The factory class for this DAO has two versions of the create() method - one that
takes no arguments and one that takes a Connection argument. If the Connection version
is chosen then the connection will be stored in this attribute and will be used by all
calls to this DAO, otherwise a new Connection will be allocated for each operation.
   */
  protected java.sql.Connection userConn;

  protected static final Logger logger = Logger.getLogger( AuthorDaoImpl.class );

  /**
   * All finder methods in this class use this SELECT constant to build their queries
   */
  protected final String SQL_SELECT = "SELECT ID, NAME, PASSWORD FROM " + getTableName() + "";

  /**
   * Finder methods will pass this value to the JDBC setMaxRows method
   */
  private int maxRows;

  /**
   * SQL INSERT statement for this table
   */
  protected final String SQL_INSERT = "INSERT INTO " + getTableName() + " ( ID, NAME, PASSWORD ) VALUES ( ?, ?, ? )";

  /**
   * SQL UPDATE statement for this table
   */
  protected final String SQL_UPDATE = "UPDATE " + getTableName() + " SET ID = ?, NAME = ?, PASSWORD = ? WHERE ID = ?";

  /**
   * SQL DELETE statement for this table
   */
  protected final String SQL_DELETE = "DELETE FROM " + getTableName() + " WHERE ID = ?";

  /**
   * Index of column ID
   */
  protected static final int COLUMN_ID = 1;

  /**
   * Index of column NAME
   */
  protected static final int COLUMN_NAME = 2;

  /**
   * Index of column PASSWORD
   */
  protected static final int COLUMN_PASSWORD = 3;

  /**
   * Number of columns
   */
  protected static final int NUMBER_OF_COLUMNS = 3;

  /**
   * Index of primary-key column ID
   */
  protected static final int PK_COLUMN_ID = 1;

  /**
   * Inserts a new row in the AUTHOR table.
   */
  public AuthorPk insert(Author dto) throws AuthorDaoException
  {
    long t1 = System.currentTimeMillis();
    // declare variables
    final boolean isConnSupplied = (userConn != null);
    Connection conn = null;
    PreparedStatement stmt = null;
    ResultSet rs = null;
    
    try {
      // get the user-specified connection or get a connection from the ResourceManager
      conn = isConnSupplied ? userConn : ResourceManager.getConnection();
    
      StringBuffer sql = new StringBuffer();
      sql.append( "INSERT INTO " + getTableName() + " (" );
      int modifiedCount = 0;
      if (dto.isIdModified()) {
        if (modifiedCount > 0) {
          sql.append( ", " );
        }
    
        sql.append( "ID" );
        modifiedCount++;
      }
    
      if (dto.isNameModified()) {
        if (modifiedCount > 0) {
          sql.append( ", " );
        }
    
        sql.append( "NAME" );
        modifiedCount++;
      }
    
      if (dto.isPasswordModified()) {
        if (modifiedCount > 0) {
          sql.append( ", " );
        }
    
        sql.append( "PASSWORD" );
        modifiedCount++;
      }
    
      if (modifiedCount==0) {
        // nothing to insert
        throw new IllegalStateException( "Nothing to insert" );
      }
    
      sql.append( ") VALUES (" );
      for (int i=0; i < modifiedCount; i++ ) {
        if (i>0) {
          sql.append( "," );
        }
    
        sql.append( "?" );
      }
    
      sql.append( ")" );
      stmt = conn.prepareStatement( sql.toString() );
      int index = 1;
      if (dto.isIdModified()) {
        stmt.setInt( index++, dto.getId() );
      }
    
      if (dto.isNameModified()) {
        stmt.setString( index++, dto.getName() );
      }
    
      if (dto.isPasswordModified()) {
        stmt.setString( index++, dto.getPassword() );
      }
    
      if (logger.isDebugEnabled()) {
        logger.debug( "Executing " + sql.toString() + " with values: " + dto);
      }
    
      int rows = stmt.executeUpdate();
      long t2 = System.currentTimeMillis();
      if (logger.isDebugEnabled()) {
        logger.debug( rows + " rows affected (" + (t2-t1) + " ms)");
      }
    
      return dto.createPk();
    }
    catch (SQLException _e) {
      logger.error( "SQLException: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "SQLException: " + _e.getMessage(), _e );
    }
    catch (Exception _e) {
      logger.error( "Exception: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "Exception: " + _e.getMessage(), _e );
    }
    finally {
      ResourceManager.close(stmt);
      if (!isConnSupplied) {
        ResourceManager.close(conn);
      }
    
    }
    
  }

  /**
   * Updates a single row in the AUTHOR table.
   */
  public void update(AuthorPk pk, Author dto) throws AuthorDaoException
  {
    long t1 = System.currentTimeMillis();
    // declare variables
    final boolean isConnSupplied = (userConn != null);
    Connection conn = null;
    PreparedStatement stmt = null;
    
    try {
      // get the user-specified connection or get a connection from the ResourceManager
      conn = isConnSupplied ? userConn : ResourceManager.getConnection();
    
      StringBuffer sql = new StringBuffer();
      sql.append( "UPDATE " + getTableName() + " SET " );
      boolean modified = false;
      if (dto.isIdModified()) {
        if (modified) {
          sql.append( ", " );
        }
    
        sql.append( "ID=?" );
        modified=true;
      }
    
      if (dto.isNameModified()) {
        if (modified) {
          sql.append( ", " );
        }
    
        sql.append( "NAME=?" );
        modified=true;
      }
    
      if (dto.isPasswordModified()) {
        if (modified) {
          sql.append( ", " );
        }
    
        sql.append( "PASSWORD=?" );
        modified=true;
      }
    
      if (!modified) {
        // nothing to update
        return;
      }
    
      sql.append( " WHERE ID=?" );
      if (logger.isDebugEnabled()) {
        logger.debug( "Executing " + sql.toString() + " with values: " + dto);
      }
    
      stmt = conn.prepareStatement( sql.toString() );
      int index = 1;
      if (dto.isIdModified()) {
        stmt.setInt( index++, dto.getId() );
      }
    
      if (dto.isNameModified()) {
        stmt.setString( index++, dto.getName() );
      }
    
      if (dto.isPasswordModified()) {
        stmt.setString( index++, dto.getPassword() );
      }
    
      stmt.setInt( index++, pk.getId() );
      int rows = stmt.executeUpdate();
      long t2 = System.currentTimeMillis();
      if (logger.isDebugEnabled()) {
        logger.debug( rows + " rows affected (" + (t2-t1) + " ms)");
      }
    
    }
    catch (SQLException _e) {
      logger.error( "SQLException: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "SQLException: " + _e.getMessage(), _e );
    }
    catch (Exception _e) {
      logger.error( "Exception: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "Exception: " + _e.getMessage(), _e );
    }
    finally {
      ResourceManager.close(stmt);
      if (!isConnSupplied) {
        ResourceManager.close(conn);
      }
    
    }
    
  }

  /**
   * Deletes a single row in the AUTHOR table.
   */
  public void delete(AuthorPk pk) throws AuthorDaoException
  {
    long t1 = System.currentTimeMillis();
    // declare variables
    final boolean isConnSupplied = (userConn != null);
    Connection conn = null;
    PreparedStatement stmt = null;
    
    try {
      // get the user-specified connection or get a connection from the ResourceManager
      conn = isConnSupplied ? userConn : ResourceManager.getConnection();
    
      if (logger.isDebugEnabled()) {
        logger.debug( "Executing " + SQL_DELETE + " with PK: " + pk);
      }
    
      stmt = conn.prepareStatement( SQL_DELETE );
      stmt.setInt( 1, pk.getId() );
      int rows = stmt.executeUpdate();
      long t2 = System.currentTimeMillis();
      if (logger.isDebugEnabled()) {
        logger.debug( rows + " rows affected (" + (t2-t1) + " ms)");
      }
    
    }
    catch (SQLException _e) {
      logger.error( "SQLException: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "SQLException: " + _e.getMessage(), _e );
    }
    catch (Exception _e) {
      logger.error( "Exception: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "Exception: " + _e.getMessage(), _e );
    }
    finally {
      ResourceManager.close(stmt);
      if (!isConnSupplied) {
        ResourceManager.close(conn);
      }
    
    }
    
  }

  /**
   * Returns the rows from the AUTHOR table that matches the specified primary-key value.
   */
  public Author findByPrimaryKey(AuthorPk pk) throws AuthorDaoException
  {
    return findByPrimaryKey( pk.getId() );
  }

  /**
   * Returns all rows from the AUTHOR table that match the criteria 'ID = :id'.
   */
  public Author findByPrimaryKey(int id) throws AuthorDaoException
  {
    Author ret[] = findByDynamicSelect( SQL_SELECT + " WHERE ID = ?", new Object[] { new Integer(id) } );
    return ret.length==0 ? null : ret[0];
  }

  /**
   * Returns all rows from the AUTHOR table that match the criteria ''.
   */
  public Author[] findAll() throws AuthorDaoException
  {
    return findByDynamicSelect( SQL_SELECT + " ORDER BY ID", null );
  }

  /**
   * Returns all rows from the AUTHOR table that match the criteria 'ID = :id'.
   */
  public Author[] findWhereIdEquals(int id) throws AuthorDaoException
  {
    return findByDynamicSelect( SQL_SELECT + " WHERE ID = ? ORDER BY ID", new Object[] { new Integer(id) } );
  }

  /**
   * Returns all rows from the AUTHOR table that match the criteria 'NAME = :name'.
   */
  public Author[] findWhereNameEquals(String name) throws AuthorDaoException
  {
    return findByDynamicSelect( SQL_SELECT + " WHERE NAME = ? ORDER BY NAME", new Object[] { name } );
  }

  /**
   * Returns all rows from the AUTHOR table that match the criteria 'PASSWORD = :password'.
   */
  public Author[] findWherePasswordEquals(String password) throws AuthorDaoException
  {
    return findByDynamicSelect( SQL_SELECT + " WHERE PASSWORD = ? ORDER BY PASSWORD", new Object[] { password } );
  }

  /**
   * Method 'AuthorDaoImpl'
   *
   */
  public AuthorDaoImpl()
  {
  }

  /**
   * Method 'AuthorDaoImpl'
   *
   * @param userConn
   */
  public AuthorDaoImpl(final java.sql.Connection userConn)
  {
    this.userConn = userConn;
  }

  /**
   * Sets the value of maxRows
   */
  public void setMaxRows(int maxRows)
  {
    this.maxRows = maxRows;
  }

  /**
   * Gets the value of maxRows
   */
  public int getMaxRows()
  {
    return maxRows;
  }

  /**
   * Method 'getTableName'
   *
   * @return String
   */
  public String getTableName()
  {
    return "AUTHOR";
  }

  /**
   * Fetches a single row from the result set
   */
  protected Author fetchSingleResult(ResultSet rs) throws SQLException
  {
    if (rs.next()) {
      Author dto = new Author();
      populateDto( dto, rs);
      return dto;
    } else {
      return null;
    }
    
  }

  /**
   * Fetches multiple rows from the result set
   */
  protected Author[] fetchMultiResults(ResultSet rs) throws SQLException
  {
    Collection resultList = new ArrayList();
    while (rs.next()) {
      Author dto = new Author();
      populateDto( dto, rs);
      resultList.add( dto );
    }
    
    Author ret[] = new Author[ resultList.size() ];
    resultList.toArray( ret );
    return ret;
  }

  /**
   * Populates a DTO with data from a ResultSet
   */
  protected void populateDto(Author dto, ResultSet rs) throws SQLException
  {
    dto.setId( rs.getInt( COLUMN_ID ) );
    dto.setName( rs.getString( COLUMN_NAME ) );
    dto.setPassword( rs.getString( COLUMN_PASSWORD ) );
  }

  /**
   * Returns all rows from the AUTHOR table that match the specified arbitrary SQL statement
   */
  public Author[] findByDynamicSelect(String sql, Object[] sqlParams) throws AuthorDaoException
  {
    // declare variables
    final boolean isConnSupplied = (userConn != null);
    Connection conn = null;
    PreparedStatement stmt = null;
    ResultSet rs = null;
    
    try {
      // get the user-specified connection or get a connection from the ResourceManager
      conn = isConnSupplied ? userConn : ResourceManager.getConnection();
    
      // construct the SQL statement
      final String SQL = sql;
    
    
      if (logger.isDebugEnabled()) {
        logger.debug( "Executing " + SQL);
      }
    
      // prepare statement
      stmt = conn.prepareStatement( SQL );
      stmt.setMaxRows( maxRows );
    
      // bind parameters
      for (int i=0; sqlParams!=null && i < sqlParams.length; i++ ) {
        stmt.setObject( i+1, sqlParams[i] );
      }
    
    
      rs = stmt.executeQuery();
    
      // fetch the results
      return fetchMultiResults(rs);
    }
    catch (SQLException _e) {
      logger.error( "SQLException: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "SQLException: " + _e.getMessage(), _e );
    }
    catch (Exception _e) {
      logger.error( "Exception: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "Exception: " + _e.getMessage(), _e );
    }
    finally {
      ResourceManager.close(rs);
      ResourceManager.close(stmt);
      if (!isConnSupplied) {
        ResourceManager.close(conn);
      }
    
    }
    
  }

  /**
   * Returns all rows from the AUTHOR table that match the specified arbitrary SQL statement
   */
  public Author[] findByDynamicWhere(String sql, Object[] sqlParams) throws AuthorDaoException
  {
    // declare variables
    final boolean isConnSupplied = (userConn != null);
    Connection conn = null;
    PreparedStatement stmt = null;
    ResultSet rs = null;
    
    try {
      // get the user-specified connection or get a connection from the ResourceManager
      conn = isConnSupplied ? userConn : ResourceManager.getConnection();
    
      // construct the SQL statement
      final String SQL = SQL_SELECT + " WHERE " + sql;
    
    
      if (logger.isDebugEnabled()) {
        logger.debug( "Executing " + SQL);
      }
    
      // prepare statement
      stmt = conn.prepareStatement( SQL );
      stmt.setMaxRows( maxRows );
    
      // bind parameters
      for (int i=0; sqlParams!=null && i < sqlParams.length; i++ ) {
        stmt.setObject( i+1, sqlParams[i] );
      }
    
    
      rs = stmt.executeQuery();
    
      // fetch the results
      return fetchMultiResults(rs);
    }
    catch (SQLException _e) {
      logger.error( "SQLException: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "SQLException: " + _e.getMessage(), _e );
    }
    catch (Exception _e) {
      logger.error( "Exception: " + _e.getMessage(), _e );
      throw new AuthorDaoException( "Exception: " + _e.getMessage(), _e );
    }
    finally {
      ResourceManager.close(rs);
      ResourceManager.close(stmt);
      if (!isConnSupplied) {
        ResourceManager.close(conn);
      }
    
    }
    
  }

}


To me, this is reams of boilerplate code with hardly any abstraction for the common SQL operations. Have a look at the method Author[] findByDynamicSelect(...), which gets duplicated for every DAO you generate. Apart from the return type and the exception class, everything is identical across all DAOs! This, I think, is a design smell. As Andy has mentioned in his blog, code generators don't mind writing repetitive code, but all this repetitive stuff goes to swell the bottom line of my application. I have been working on a financials project with 600 database tables - the amount of boilerplate code generated by the above approach would be enough to stretch the existing jar size by at least 30%. I simply could not afford that decision.

Bye Bye Code Generators - Enter OO

In my earlier post, I discussed how generic DAOs have helped us shrinkwrap the core data access functionality into generic base abstractions - every specific DAO implementation contains only the details of that particular database table. Have a look at the following code generated for a table Employee using our code generator and OO framework concoction:


package com.anshinsoft.pi.dao.app.dao;

import java.util.ArrayList;
import java.util.List;

import org.apache.commons.beanutils.BeanUtils;

import com.anshinsoft.pi.core.StringBufferSize;
import com.anshinsoft.pi.dao.DAOBase;
import com.anshinsoft.pi.dao.DAOImplBase;
import com.anshinsoft.pi.dao.DTOBase;
import com.anshinsoft.pi.pix.orm.DbUtils;
import com.anshinsoft.pi.pix.orm.ICriteria;
import com.anshinsoft.pi.pix.orm.NullCriteria;
import com.anshinsoft.pi.pix.orm.SimpleCriteria;


/**
 * The DAO implementation for the {@link Employee} class.
 */
public class EmployeeDAO<T extends DTOBase>
    extends DAOBase<T> {


  /**
   * Enum for column names.
   */
  public enum ColumnNames {

    PK,
    ID,
    NAME,
    BIRTH_DATE,
    JOINING_DATE,
    DESIGNATION,
    ADDRESS_PK,
    PASSPORT_NO,
    EXPERTISE;

    /**
     * Constructor.
     */
    ColumnNames() {
    }
  }


  /**
   * Constructor.
   *
   * @param impl the concrete implementation of {@link DAOImplBase}
   */
  public EmployeeDAO(DAOImplBase impl) {
    super(impl);
  }

  /**
   * Returns a list of the column names.
   * @return list of column names.
   */
  protected List<String> getColumnNames() {
    ColumnNames[] names = ColumnNames.values();
    List<String> l = new ArrayList<String>(names.length);
    for (ColumnNames name : names) {
      l.add(name.name());
    }
    return l;
  }


  /**
   * Subclasses must override and provide the TABLE_NAME
   * that the bean is associated with.
   *
   * @return the table name.
   */
  public String getTableName() {
    return "EMPLOYEE";
  }


  /**
   * {@inheritDoc}.
   */
  protected ICriteria getPrimaryKeyWhereClause(T employee) {
    try {
      String str =
      BeanUtils.getProperty(employee,
        DbUtils.dbColumn2PropertyName(
        ColumnNames.PK.name()));
      return new SimpleCriteria(
        new StringBuilder(StringBufferSize.SMALL.size())
        .append(ColumnNames.PK.name())
        .append(" = ")
        .append(str).toString());
    } catch (Exception ex) {
      return NullCriteria.getInstance();
    }
  }
}



All mundane SQL operations are wrapped inside DAOBase<T>, and all base-level functionality of the transfer objects is encapsulated within DTOBase. In case the user wants to add additional behavior to the transfer objects, he can extend DTOBase:

class MyDTOBase extends DTOBase {
// added behavior
}

And this way they can add global behaviors like state management functionalities to all DTOs that they use.
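To make the idea a little more concrete, here is a minimal sketch of what such a generic base abstraction might look like. This is my illustration only - the class names, method signatures and SQL handling below are hypothetical, not the actual framework API:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch - illustrative names only, not the actual framework classes
public abstract class DAOBase<T> {

  // table-specific details supplied by each generated subclass
  protected abstract String getTableName();
  protected abstract List<String> getColumnNames();
  protected abstract T mapRow(ResultSet rs) throws SQLException;

  // the common SELECT processing is written once here and shared by every DAO
  public List<T> read(Connection conn, String whereClause) throws SQLException {
    StringBuilder sql = new StringBuilder("SELECT ");
    List<String> columns = getColumnNames();
    for (int i = 0; i < columns.size(); i++) {
      if (i > 0) {
        sql.append(", ");
      }
      sql.append(columns.get(i));
    }
    sql.append(" FROM ").append(getTableName());
    if (whereClause != null) {
      sql.append(" WHERE ").append(whereClause);
    }

    PreparedStatement stmt = conn.prepareStatement(sql.toString());
    try {
      ResultSet rs = stmt.executeQuery();
      List<T> result = new ArrayList<T>();
      while (rs.next()) {
        result.add(mapRow(rs));
      }
      return result;
    } finally {
      stmt.close();
    }
  }
}

With something of this shape in place, a generated DAO like the EmployeeDAO above only needs to supply the table-specific pieces - the table name, the column list and the primary key criterion.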

Monday, April 24, 2006

Scala Hosts a Friendly Visitor

GOF tells us that the intent of the Visitor design pattern is to
Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.

The focus is on defining new operations, not on allowing a flexible hierarchy of the elements themselves. In fact, the Consequences section of the Visitor pattern mentions that Visitor makes adding new operations easy and that adding new ConcreteElement classes is hard, since every new ConcreteElement forces a change in the Visitor interface and hence in all Visitor implementations. Thus the vanilla Visitor implementation in today's most widely used object-oriented languages, like Java and C++, violates the Open Closed Principle.
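To see where the violation comes from, consider a minimal sketch of the vanilla Java Visitor (the names here are hypothetical): the moment we add a new ConcreteElement, the Visitor interface - and hence every existing visitor implementation - has to change.

// minimal sketch of the vanilla Java Visitor - hypothetical names
interface Visitor {
    void visitFile(File file);
    void visitDirectory(Directory directory);
    // adding a new element type (say, Link) forces a new method here,
    // breaking every existing implementation of Visitor
}

interface Node {
    void accept(Visitor v);
}

class File implements Node {
    public void accept(Visitor v) { v.visitFile(this); }
}

class Directory implements Node {
    public void accept(Visitor v) { v.visitDirectory(this); }
}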

What's special in Scala ?

Scala offers three features which enable a Visitor implementation to have piecemeal growth. Both the visitor and the visited hierarchy can be enhanced incrementally resulting in flexible inheritance hierarchies for both of them.

The nuances of the Visitor implementation in Scala are part of the more general Expression Problem, which has been described in the Scala context by Zenger and Odersky.

Let us review the above claim using an example. The following snippet illustrates a sample combination of the Visited and Visitor hierarchy. Here we have only one level of specialization for the Visited and the Visitor abstractions, along with the base contracts for each of them.

trait Base {

  trait Node { // Element base of the hierarchy
    def accept(v: visitor): unit;
  }

  // ConcreteElement specialization
  class File(name: String) extends Node {
    def accept(v: visitor): unit = v.visitFile(name);
  }

  type visitor <: Visitor;
  trait Visitor { // Visitor base of the hierarchy
    def visitFile(name: String): unit;
  }

  // ConcreteVisitor specialization
  trait PrintingVisitor requires visitor extends Visitor {
    def visitFile(name: String): unit =
    Console.println("printing file: " + name);
  }
}

Figure 1 - The Base

Let us see how the above implementation differs from the standard Java Visitor boilerplate. The main problem with the Java/C++ implementation is that the visitor interface has to change with the addition of a ConcreteElement (the visited). This is where we violate the Open Closed Principle - we need to make invasive changes at the contract level, which forces a recursive change down the inheritance hierarchy for all concrete visitor implementations. The Scala implementation above addresses this problem by allowing us to abstract over the concrete visitor type. In order to keep the set of Visited classes open, the abstract type visitor is used. Every concrete implementation of the Visitor interface, such as PrintingVisitor above, implements its bound Visitor. And every Visited element uses the same abstract type visitor in its accept() method. It is this magic combination of abstract types and mixin composition that allows us to enrich the dual hierarchies incrementally.

In the following sections we will look at how the Scala implementation allows a seamless adding of concrete members to the Visited as well as Visitor hierarchies.


Adding a ConcreteElement (the Visited hierarchy)

// data extension : add Link
trait BaseLink extends Base {

  type visitor <: Visitor;

  trait Visitor extends super.Visitor {
    def visitLink(subject: Node): unit;
  }

  class Link(subject: Node) extends Node {
    def accept(v: visitor): unit = v.visitLink(subject);
  }

  trait PrintingVisitor requires visitor
      extends super.PrintingVisitor
      with Visitor {
    def visitLink(subject: Node): unit =
      subject.accept(this);
  }
}

Figure 2 - Adding a ConcreteElement

Override the abstract type visitor with the extended Visitor trait and include the new visitLink() method - completely non-invasive, yet extensible! The concrete implementation of the new PrintingVisitor extends the trait from the superclass and composes with the extended Visitor trait using Scala mixin composition.

Above, we claimed that the abstract type visitor allows us to abstract over the concrete visitor types. In the method visitLink() of PrintingVisitor, the call to accept() has the argument this. But the type PrintingVisitor is not a subtype of the abstract visitor type visitor. The implementation handles this with the clause requires visitor in the definition of the trait PrintingVisitor. This is called a Self Type Annotation, which provides an alternative way of associating a class with an abstract type. The self type specified is taken as the type of this within the class. Without this feature, the type of this is the usual type of the class itself. Here, the use of the self type annotation ensures that the type of this, passed as the argument to accept(), is the abstract type visitor, as demanded by the declaration of the Node trait.

Combining Independent Element Extensions

Just as with the BaseLink extension above, we can have other independent extensions in the Visited hierarchy - e.g. a BaseDirectory trait which extends Base with Directory as another Node. I do not go into the details of this implementation, since it runs exactly along the same lines as BaseLink; a sketch of what it might look like is given below. The important part is the ability to combine independent Element extensions through composition - Scala's powerful modular mixin composition is the answer, as the BaseAll example following the sketch demonstrates.
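Here is a minimal sketch of such a BaseDirectory extension - my own illustration, mirroring BaseLink above (the printed message is arbitrary):

// data extension : add Directory
trait BaseDirectory extends Base {

  type visitor <: Visitor;

  trait Visitor extends super.Visitor {
    def visitDirectory(name: String): unit;
  }

  class Directory(name: String) extends Node {
    def accept(v: visitor): unit = v.visitDirectory(name);
  }

  trait PrintingVisitor requires visitor
      extends super.PrintingVisitor
      with Visitor {
    def visitDirectory(name: String): unit =
      Console.println("printing directory: " + name);
  }
}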

// compose all
trait BaseAll extends BaseDirectory with BaseLink {

  type visitor <: Visitor;
  trait Visitor extends super[BaseDirectory].Visitor
      with super[BaseLink].Visitor;

  trait PrintingVisitor requires visitor
      extends super[BaseDirectory].PrintingVisitor
      with super[BaseLink].PrintingVisitor
      with Visitor;
}

Figure 3 - Composing all Extensions

The abstraction above combines the two independent extensions through mixin composition. In case of such mixin composition, the Scala type system requires that abstract types be refined covariantly. Hence BaseAll has to redefine the bounds of the abstract type visitor through explicit overriding, such that the bound of the new visitor is a subtype of both old bounds.

This is not all - we will see how adding new visitors in the hierarchy combines with the extensibility of the BaseAll trait, resulting in a complete implementation of the Visitor Design Pattern.


Adding a New Visitor to the Hierarchy

Adding new visitors is easy, as the pattern claims - every new operation simply adds to the Visitor hierarchy, implementing all the visitor interfaces.

// function extension
trait BaseExt extends BaseAll {
  class RemoveVisitor requires visitor extends Visitor {
    def visitFile(name: String): unit =
      Console.println("removing file: " + name);

    def visitDirectory(name: String): unit =
      Console.println("cannot remove directory: " + name);

    def visitLink(subject: Node): unit =
      subject.accept(this);
  }
}

Figure 4 - Extending the Visitor Interface


Compose Them All - The Grand Visitor

We now have the grand Visitor in place where we can add to Visited and Visitor hierarchies seamlessly ensuring piecemeal growth. The abstractions are open for extension but closed for invasive changes.

Here goes the final usage ..

object VisitorTest extends BaseExt with Application {
  type visitor = Visitor;

  val f = new File("foo.txt");
  val nodes = List(
    f,
    new Directory("bar"),
    new Link(f)
  );

  class PrintingVisitor extends super.PrintingVisitor;
  nodes.elements foreach { x => x.accept(new PrintingVisitor()); }

  class RemoveVisitor extends super.RemoveVisitor;
  nodes.elements foreach { x => x.accept(new RemoveVisitor()); }
}

Friday, April 21, 2006

Project Geometry

How do you ensure that the current Java project you have been working on uses the correct exception hierarchy for its DAOs? How do you identify the stamp of your organization when you look at the SVN checkouts of a project? How do you avoid reinventing the wheels of the maven script when you plan the build architecture of the project you are supposed to manage?

The answer to all of the above is to have a uniform Project Geometry for your organization. When I talk about the geometry, I mean only the software aspect of the project, the code that gets churned out by the software development team, the documentation artifacts that get generated in the process and the machinery that builds the code and deploys the binary to the desired target platform. The geometry ensures a uniformity not only in the look and feel of the project (the directory hierarchy, package structure, archetypes, build engine etc.), but also the innards of implementation which include the whole gamut from the design of the exception hierarchy down to the details of how the application interfaces with the external services layer. The following rumblings are some of my thoughts on what I mean when I talk about Project Geometry.

Software Reuse

In the article Four Dynamics for Bringing Use Back Into Software Reuse published in the Communications of the ACM, January 2006, Kevin C Desouza, Yukika Awazu and Amrit Tiwana identify three salient dynamics associated with the knowledge consumption lifecycle of a project - reuse, redesign and recode. They define
Reuse is the application of existing software artifacts as is; redesign is the act of altering existing software artifacts; and recoding is the discovery of new software artifacts through construction of software code or system designs.

In each of the above dynamics, there is an implicit assumption of pre-existence of software artifacts which finds place in the current lifecycle through a discovery process - either as-is or in a derived manifestation.

The question is : where from do we get these artifacts that can be reused ?

The Project Container

Every modern day IT organization that delivers software can have a Project Container, a meta-project which helps individual project teams incubate new projects. The project container evangelizes the best practices for development, deployment and documentation and provides plug-ins and archetypes to kick-start a new project for the organization.

It should be as simple as 1-2-3 .. Let us consider a case study ..

For my organization, the build platform of choice for a Java based project is maven 2 and I should be able to generate a standard project skeleton structure from an archetype which is part of my project container. Here they go ..

  1. Download plugin for bootstrap from the project container repository

  2. Run maven install (mvn install ...)

  3. Setup project home

  4. Create archetype (mvn archetype:create -D... ...)


Boom .. we go .. my entire project hierarchy skeleton is ready with the corporate standard directory hierarchy, package naming conventions, documentation folders and (most importantly) a skeleton Project Object Model (POM) for my project. When I open up my IDE, I can find my project already installed in the workspace! I can straight away start adding external dependencies to the pom.xml. Maven has really done wonders for the project engineering aspect through its concepts of archetypes, plugins and POMs. I can start defining my own project-specific package hierarchy and write my own business logic.

My Project Structure

Any Java based project bootstrapped using my organization's project container bears the stamp of its identity. With its families of plug-ins and artifacts, the project container ensures a uniform geometry across all projects delivered from this place. It's really geometry in action - promoting reuse and uniformity of structure, thereby making life easier for all team members who join the project later in the lifecycle. Joel Spolsky talks about the Development Abstraction Layer as an illusion created by management and its associated services, which makes the programmer feel that a software company can be run just by writing code. In this context, the project container takes care of the engineering aspects of the project environment and gives the programmer the feeling that delivering software is only about writing the business logic. The other machinery - coding conventions (integrated with the IDE through container based checkstyles), the build engine (again, part of the incubation process), documentation (maven based, free with the project container) and the project portal (maven generated, with container based customization) - gets plugged in automatically as part of the bootstrapping process. The best thing is that the process is repeatable - every project based on a specific platform gets bootstrapped the same way, with the same conventions replicated, resulting in a uniform project geometry.

Is This All ?

Actually, project geometry is extensible to whatever limit you take it to. I can consider a standard infrastructure layer to be part of my project container for Java based projects. The exception hierarchy, standard utilities, a generic database layer and a generic messaging layer can all be part of the container.

But what if I don't need 'em all?

You pay only for what you take. The project container repository gives you all the options it has to offer - pick and choose only the ones you need and set up the dependencies in your POM. Remember, Maven 2 can handle transitive dependencies, a feature that we have all been craving for months.

The Sky is the Limit!

Taking it to the extreme - the project container can offer you implementation options for some features if you base your code on the container's contracts. This implies that the project container is not a mere engineering vehicle - it acts as a framework as well. Suppose in your Java application you need an Application Context for obvious reasons. You can design your application's context based upon the contracts exposed by your project container, and you can choose one of the many possible implementations during deployment - Spring's IoC container based context implementation, or the vanilla flat-namespace based default implementation provided by the container itself. Whatever you do, you always honor the basic guideline: discover from the project container and make it suitable for your use, maintaining the uniform geometry within your project.

Friday, April 14, 2006

Functional Programming for the Masses - Are We Ready For It ?

I have been talking a lot about Scala of late. Scala is a multiparadigm language offering both OO and functional programming features. This post is not about Scala though. It is about the latest trend in modern language design - the return of functional programming to the mainstream. We have Ruby, Python and Scala, all of them enriched with higher order functions, type inference, initializers, lambdas, meta-programming with expression trees and many other features which have their roots in functional programming languages like Lisp and Haskell.

But guess who is the leader? The big daddy, of course. Microsoft has unleashed the LINQ framework, which will be supported by the upcoming versions of C# 3.0 and Visual Basic 9. Erik Meijer has hit the nail right on the head in LtU - Functional Programming Has Reached The Masses - It's Called Visual Basic. In his confessions, Erik notes
After a long journey through theoretical computer science, database theory, functional programming, and scripting, abstract concepts such as monoids, lambda-expressions, and comprehensions have finally reached the day-to-day world of ordinary programmers. The LINQ framework effectively introduces monad comprehensions into the upcoming versions of C# 3.0 and Visual Basic 9.


LINQ defines query operators for declarative traversal, filtering and projection operations over abstract collections (based on IEnumerable<T>). The query interface extends seamlessly over both XML and SQL data, as this overview document from Microsoft states
The query operators over XML (XLinq) use an efficient, easy-to-use in-memory XML facility to provide XPath/XQuery functionality in the host programming language. The query operators over relational data (DLinq) build on the integration of SQL-based schema definitions into the CLR type system. This integration provides strong typing over relational data while retaining the expressive power of the relational model and the performance of query evaluation directly in the underlying store.


In today's application development environment, we use relational databases for persistence, objects for implementing business logic and XML in the presentation tier - the infamous ROX triangle of Erik. By offering all three paradigms within the same framework, LINQ promises to unify the three data models for the application developer. Under the hood it's all monads and comprehensions; however, the application programmer uses his favourite SQL-like syntax

Dim expr As IEnumerable(Of String) = _
    Select s.ToUpper() _
    From s in names _
    Where s.Length = 5 _
    Order By s


Along with .NET providing seamless functional programming capabilities uniformly over C# 3.0 and Visual Basic 9, modern languages like Ruby, Python and Scala also have a lot to offer in this game. Programmers at large are getting to see the daylight of the elegance of functional programming through monads, closures, meta-programming and comprehensions. Despite the fact that each of these has been around for years, locked up in the functional programming community and discussed by a handful in LtU, they are expected to be part of mainstream programming very shortly. The question, however, is: is the programming community ready for this? In one of my earlier posts, I expressed concern over the lack of knowledge of the basic principles of functional languages - recursion, higher order functions, function currying, closures - amongst the myriads of programmers coming out of CS schools. I think it's time the grad schools shake themselves out of their Java clothing and go back to the first principles that SICP has preached for ages.

Wednesday, April 12, 2006

Generics in Scala - Part 1

How big should a programming language be? Well, that depends on what you would like to do with it. Common Lisp started out as a small, elegant functional language, but with the proliferation of multiple dialects, the addition of object-oriented components (CLOS), a generalized iteration construct (LOOP), a full blown condition system etc., the current ANSI Common Lisp specification is about 1,300 pages long and has 1,000 defined symbols [source: Patterns of Software - Gabriel].

Scala is a rich language targeted towards component based development - it offers a fusion of the object oriented and functional paradigms. The advanced type system of Scala, backed by rich features like type inferencing and type annotations, places Scala far ahead of its contemporaries. Scala's support for generics is the most advanced amongst modern languages - it is possibly the only modern language that supports both parameterized types (originating from functional languages) and virtual types (from the OO paradigm).

Scala Parameterized Types Implemented through Erasure

This is similar to the Java implementation, where the generic type information gets erased after compilation. The Scala designers have been working towards reification of generics and have added experimental support for interoperability with Java generics since Scala 2. Here is a simple example of a generic class in Scala:

object GenericsTest extends Application {

  abstract class Sum[t] {
    def add(x: t, y: t): t;
  }

  object intSum extends Sum[Int] {
    def add(x: Int, y: Int): Int = x + y;
  }

  class ListSum[a] extends Sum[List[a]] {
    def add(x : List[a], y : List[a]) : List[a] = x ::: y;
  }

  val s = new ListSum().add(List("a", "b"), List("c", "d"));
  val t = new ListSum().add(List(1, 2), List(3, 4));
  Console.println(s);
  Console.println(t);
}


Scala Parameterized Types Allow Variance Annotations

There is a strong relation between parameterized types and (co|contra|in)variance in type declarations / expressions. Java class declarations do not support any covariance annotations - List<Derived> is never a subtype of List<Base> in Java, even though Derived is a subtype of Base. However, Java allows covariance of generic types to be expressed by annotating every occurrence of the List type to match the form List<? extends Base>. This implements a form of family polymorphism across all List types whose type argument is an arbitrary subtype of Base.
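As a quick illustration of this usage-site covariance in Java (a small sketch of my own, not from the post):

import java.util.ArrayList;
import java.util.List;

public class UsageSiteVariance {
  public static void main(String[] args) {
    List<Integer> ints = new ArrayList<Integer>();
    ints.add(Integer.valueOf(42));

    // a List<Integer> viewed covariantly as a list of "some subtype of Number"
    List<? extends Number> nums = ints;
    Number n = nums.get(0);              // reading elements as Number is safe
    System.out.println(n);

    // nums.add(Double.valueOf(1.0));    // does not compile - writes are rejected
  }
}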

Scala allows variance annotations at the class declaration (unlike Java). Consider the following declaration of an immutable Stack (adapted from Scala By Example):

abstract class Stack[+a] {
  def push[b >: a](x: b): Stack[b] = new NonEmptyStack(x, this)
  def isEmpty: boolean
  def top: a
  def pop: Stack[a]
}

object EmptyStack extends Stack[Nothing] {
  def isEmpty = true
  def top = throw new Error("EmptyStack.top")
  def pop = throw new Error("EmptyStack.pop")
}

class NonEmptyStack[+a](elem: a, rest: Stack[a]) extends Stack[a] {
  def isEmpty = false
  def top = elem
  def pop = rest
}


The + in type argument for the class definition indicates that subtyping is covariant on that type parameter. A - in the same place changes the relationship to contravariance. The default (without any prefix) declaration indicates invariance of subtyping.

Along with variance annotations at class declarations, Scala's type checker verifies their soundness as well. Though safe for immutable data types, covariant subtyping is never safe for mutable data types, and the type checker is quick to disallow such unsafe declarations, e.g. in the following declaration of a mutable Stack

class Stack[+T] {
  var elems: List[T] = Nil
  def push(x: T): Unit = elems = x :: elems
  def top: T = elems.head
  def pop: Unit = elems = elems.tail
}

The covariant type parameter T appears in a contravariant position (the parameter of push()) - the Scala compiler will flag this as an error. Compare this approach with Java's covariant subtyping for arrays, where we need to depend on runtime checks to ensure type safety of the underlying structure.
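The classic Java illustration of that runtime check (my own sketch):

public class ArrayCovariance {
  public static void main(String[] args) {
    Object[] objects = new String[1];  // legal - Java arrays are covariant
    objects[0] = Integer.valueOf(42);  // compiles, but throws ArrayStoreException at runtime
  }
}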

The question is how we avoid a similar situation with the push() method in the immutable version of the Stack definition. Immutable data structures are always candidates for covariant subtyping, and Scala achieves this through the specification of lower bounds for type parameters. Observe the contract def push[b >: a](x: b): Stack[b] in the functional version of the Stack definition above. In this contract the type parameter a does not appear in a contravariant position (as the parameter type of push()). Instead, it has been moved to a lower bound for another type parameter of the method, which is a covariant position. This lower bound makes it completely typesafe to implement covariant subtyping.
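A small sketch shows the lower bound in action, assuming the immutable Stack above and a hypothetical Fruit hierarchy of my own:

class Fruit;
class Apple extends Fruit;
class Orange extends Fruit;

val apples: Stack[Apple] = EmptyStack.push(new Apple);

// pushing an Orange is allowed: b is inferred as Fruit (the common supertype),
// and the result is safely widened to a Stack[Fruit]
val fruit: Stack[Fruit] = apples.push(new Orange);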


Subtyping with Generic Types - Scala, Java and C#

As I mentioned above, Scala supports covariant typing using annotations at the class declaration, as opposed to Java, which allows annotations at type expressions. Thus Scala restricts annotations to the class designer, while Java leaves them to the users. The Scala designers found it difficult for users to maintain consistency of usage-site annotations, as Odersky observes in his OOPSLA 05 paper on abstractions
In an earlier version of Scala we also experimented with usage-site variance annotations similar to wildcards. At first-sight, this scheme is attractive because of its extensibility. A single class might have covariant as well as non-variant fragments; the user chooses between the two by placing or omitting wildcards. However, this increased extensibility comes at price, since it is now the user of a class instead of its designer who has to make sure that variance annotations are used consistently.


However, the Java implementation of generics achieves flexibility in typing through the use of wildcards along with the type-capture technique, as detailed in this wonderful exposition by Torgersen et al. This paper is a strong justification of adding wildcard support to a generics implementation and discusses all the related issues of subtyping in an object oriented environment. Wildcards have added flexibility to Java typing in the sense that the user can now declare the same List class as covariant (List<? extends Number>) or contravariant (List<? super Number>). The other player in this game, Microsoft .NET, does not use erasure and does not support wildcards in its generics implementation - instead they have chosen to make the runtime support the implementation. The designers of C# do not consider subtyping to be an issue with generic types and do not feel wildcard support is an important offering. In his weblog Puzzling through Erasure II, Eric Gunnerson mentions that it is more important to have generics over value types and performant implementations than to support wildcards.

In this entry I have discussed only one aspect of Scala's generics implementation - parameterized types. In the next post, I will discuss the OO side of generics - the Abstract Type Member, which implements virtual types in Scala.

Friday, April 07, 2006

Java Persistence - Fun with Generic DAOs

Handling Java persistence through DAOs has really picked up, particularly in the small to mid-sized application development space, where teams cannot afford a full fledged ORM implementation. In my last post on the subject of Java persistence, I projected ORM and DAO as cooperating technologies (rather than competing ones) and suggested that mid sized applications should use both to ensure proper layering and separation of concerns.

FireStorm/DAO from CodeFutures is one of the very popular products in the market today that generate all the DAOs and POJOs once you feed in your database schema. The code generator from FireStorm/DAO generates one DAO interface and one implementation class for each database table, along with separate classes for the DTOs, factories and exceptions. Looking at the generated code, I somehow have the feeling that it could have been engineered better by employing a higher level of abstraction. For every DAO, the basic functionality provided is the same; e.g. findByPrimaryKey() locates a record from the table using the passed-in primary key and findAll() fetches all records from the table - yet we do not find any base class abstracting the commonality of SQL processing!

In the meantime, I have been doing some workouts on generic DAOs, with the idea of making the generated code more compact and better engineered without sacrificing an iota of typesafety. My last posting on this subject has some information on what I thought could be the base of my design - a Bridge based implementation of generic DAOs -

  • the interface hierarchy providing the contracts to be extended by application DAOs

  • the implementation hierarchy, providing options to implement the DAOs using different engines like JDBC, Hibernate etc.


Here's a sample usage for a table Employee, executing a query with a dynamic criterion:

// make the DAO
EmployeeDAO<Employee> e =
    DAOFactory.getInstance().getEmployeeDAO();

// make the criteria
ICriteria cri = new SimpleCriteria("employee_id = 10");

// get 'em
List<Employee> es = e.read(cri, Employee.class);


Note that in the above snippet, EmployeeDAO is a generic class and takes a POJO as the actual parameter of instantiation. EmployeeDAO has been generated as :

public class EmployeeDAO<T extends DTOBase> extends DAOBase<T> {
    ....
}


An alternative could have been to have the EmployeeDAO as a concrete class with the POJO Employee hardwired :

public class EmployeeDAO extends DAOBase<Employee> {
    ....
}


Well, it really depends upon how much flexibility you want to give your users. With the former approach, the user is allowed to construct the EmployeeDAO with any other POJO, so long as the POJO contains properties that match the database column names. This is often useful when the user wants to work with database views in the application, which are backed by different POJOs than the ones associated with the main tables.

The abstraction DTOBase provides a suitable container for all the stuff common to all DTOs; e.g. bean state management functionality (whether the bean has been changed since its last fetch from the database) is a very powerful candidate for this placeholder. Incidentally, this functionality is scattered throughout all the POJOs in FireStorm/DAO generated code.
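As an illustration, here is a minimal sketch of what such state management in DTOBase might look like - the names and methods below are hypothetical, not our actual framework code:

// hypothetical sketch of dirty-state tracking in the DTO base class
public abstract class DTOBase implements java.io.Serializable {

  private transient boolean dirty = false;

  // generated setters call this whenever a property changes
  protected void markDirty() {
    this.dirty = true;
  }

  // the persistence layer calls this after a fetch or a successful save
  public void markClean() {
    this.dirty = false;
  }

  // lets the DAO skip updates for beans that have not changed
  public boolean isDirty() {
    return dirty;
  }
}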


Projection on a Table is also a DAO!

With the above code snippet for the query, we can read entire records from the Employee table into the List<Employee>. Now what about projections - I want to select only a few columns of the table into my POJO. The point to note is that the end result of a projection is no different from that of the above read(), except that we now have a smaller list of columns to deal with. It is simply an operation on a DAO which transforms the end result to contain a subset of the column list.

Think Decorator! Decorate a DAO to get the Projection :

public class Projection<T extends DTOBase> extends DAOBase<T> {
    private DAOBase<T> base;
    ...
}


And now the usage :

// set up the Projection
Projection<Employee> ep =
  DAOFactory.getInstance()
    .getEmployeeProjection()
    .setColumn(EmployeeDAO.ColumnNames.EMPLOYEE_ID)
    .setColumn(EmployeeDAO.ColumnNames.EMPLOYEE_NAME);

// get 'em
List<Employee> es = ep.read(cri, Employee.class);


Cool stuff! We have the encapsulation of the SQL married to the typesafety of operations. The user does not have to write SQL statements or use hardcoded column names that can only be checked at runtime. That was the promise of the DAO Design Pattern - right?


How about Joins ?

Joins are nothing but compositions of multiple tables based on primary key / foreign key relationships. The result of a Join is also a DAO!

public class Join<T extends DTOBase> extends DAOBase<T> {
  public Join<T> addJoinParticipant(DAOBase<? extends DTOBase> dao) {
    ...
  }
}


Compose as you Wish

// make the DAO
EmployeeDAO<Employee> e =
DAOFactory.getInstance().getEmployeeDAO();

// make the DAO
EmpSalaryDAO<EmpSalary> s =
DAOFactory.getInstance().getEmpSalaryDAO();

// make the criteria
ICriteria cri = new SimpleCriteria(
new StringBuilder(128)
    .append(EmployeeDAO.ColumnNames.EMPLOYEE_ID.name())
    .append(" = 10").toString());

Join<EmpSalaryDetails> d =
    DAOFactory.getInstance()
      .getEmpSalaryDetailsDAO()
        .addJoinParticipant(e)
        .addJoinParticipant(s)
        .addLeftColumn(EmployeeDAO.ColumnNames.EMPLOYEE_ID)
        .addRightColumn(EmpSalaryDAO.ColumnNames.EMPLOYEE_ID);

Projection<EmpSalaryDetails> esd =
    new Projection<EmpSalaryDetails>(
        new DAOJDBCImpl<EmpSalaryDetails>(),
        d);

// get 'em
List<EmpSalaryDetails> es = esd.read(cri, EmpSalaryDetails.class);

Thursday, April 06, 2006

Conquest of the Arithmetic Progression

The May-June issue of The American Scientist contains an essay by Brian Hayes, in which he takes us through a number of anecdotes and references describing all the variations of the popular story of Carl Friedrich Gauss conquering the world of arithmetic progressions as a little school kid. The essay begins with Brian's own version of the story
In the 1780s a provincial German schoolmaster gave his class the tedious assignment of summing the first 100 integers. The teacher's aim was to keep the kids quiet for half an hour, but one young pupil almost immediately produced an answer: 1 + 2 + 3 + ... + 98 + 99 + 100 = 5,050. The smart aleck was Carl Friedrich Gauss, who would go on to join the short list of candidates for greatest mathematician ever. Gauss was not a calculating prodigy who added up all those numbers in his head. He had a deeper insight: If you "fold" the series of numbers in the middle and add them in pairs—1 + 100, 2 + 99, 3 + 98, and so on—all the pairs sum to 101. There are 50 such pairs, and so the grand total is simply 50×101. The more general formula, for a list of consecutive numbers from 1 through n, is n(n + 1)/2.


And then he digs deep into the research that he did to collect all versions of the famous anecdote. A fascinating read ..

Saturday, April 01, 2006

Scala: Compose Classes with Mixins

Scala is one of the purest object-oriented languages, with an advanced type system. It supports data definition and abstraction through class hierarchies, along with efficient means for object composition and aggregation. In this post, we will go through some of the composition mechanisms supported by Scala, which make the construction of reusable components much easier than in Java.

However, before going into the details of all the mechanisms, let us look at how Scala differentiates itself in expressing the basic construct of object oriented programming - the class definition. The following example is adapted from Ted Neward (with some changes):

class Person(ln : String, fn : String, s : Person) {
  def lastName = ln;
  def firstName = fn;
  def spouse = s;

  def this(ln : String, fn : String) = { this(ln, fn, null); }

  private def isMarried() = {spouse != null;}

  def computeTax(income: double) : double = {
      if (isMarried()) income * 0.20
      else income * 0.30
  }

  override def toString() =
      "Hi, my name is " + firstName + " " + lastName +
      (if (spouse != null) " and this is my spouse, " + spouse.firstName + " " + spouse.lastName + "." else ".");
}

Let us quickly browse through the salient features of the above type definition where Scala differs from traditional OO languages like Java.

  • The main highlight is the conciseness, or brevity, as Ted Neward points out in his blog, where he also does an interesting comparison of the brevity of class definitions in Java, C#, VB, Ruby and Scala. Try writing the same definition in any of the mentioned languages - you will get the point straight away!

  • The class declaration itself takes the arguments for the primary constructor. The member definitions in the next three lines take care of the rest of the initialization. As a programmer, I have to type less - this is cool stuff! The constructor for an unmarried Person is defined as a separate constructor, def this(...) - see the usage sketch after this list.

  • Like Ruby, Scala provides accessor methods directly around the fields. Being also a functional language, Scala does not provide default mutator semantics in the class definition. In order to mutate a field, you have to do it explicitly through state mutator methods.

  • Any class in Scala with an unspecified parent inherits from AnyRef, which is mapped to java.lang.Object in the Java implementation. The overriding of a method from the parent is explicitly decorated with the keyword override - the rest of the inheritance semantics is the same as in Java.
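Putting the above pieces together, here is a small usage sketch of my own (the values are, of course, illustrative):

// illustrative usage of the Person class defined above
object PersonTest extends Application {
  val ted = new Person("Neward", "Ted");                  // auxiliary constructor - no spouse
  val charlotte = new Person("Neward", "Charlotte", ted);

  Console.println(charlotte);                      // uses the overridden toString()
  Console.println(ted.computeTax(100000));         // unmarried - taxed at 30%
  Console.println(charlotte.computeTax(100000));   // married - taxed at 20%
}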


Class Composition in Scala

The rich type system of Scala offers powerful methods of class composition, which are significant improvements over that provided by popular OO languages like Java. Modular mixin composition of Scala offers a fine-grained mechanism for composing classes for designing reusable components, without the problems that techniques like multiple inheritance bring on board.

For class composition, Java offers "interfaces" in addition to single inheritance, while C++ offers direct "multiple inheritance" through classes. The problem with Java interfaces is that they contain only method signatures - the implementer of the interface has to provide implementations for all the methods that belong to the interface. Hence factoring out common implementations as reusable artifacts is not possible. OTOH, multiple inheritance in C++ offers direct and symmetric composition of classes without introducing any additional construct, but it brings with it the dreaded diamond syndrome - name conflicts and ambiguities. While this can be controlled by using virtual inheritance, factoring out reusable functionality in the form of composable abstractions is still clumsy, at best, in C++. The idiom of using generic superclasses to implement mixins, popularized by Van Hilst and Notkin, provides a solution, but it still does not make mixins a part of the C++ type system. For more details, check this excellent treatment on the subject.

Scala's mixin-class composition allows programmers to provide default implementations at the mixin level itself, thereby reusing the delta of a class definition, i.e., all new definitions that are not inherited. Mixins in Scala are similar to Java interfaces in that they can be used to define contracts / signatures for subtypes to implement. The difference is that the programmer is allowed to provide default implementations for some of the methods that are part of the mixin. While this may appear similar to what abstract classes provide in Java, mixins can also be used as a vehicle to implement multiple inheritance (which Java abstract classes do not offer). And finally, mixins, unlike classes, may not have constructor parameters.

To define a class that will be used as a mixin for composition, Scala uses the keyword trait.

// defining a trait in Scala
trait Emptiness {
    def isEmpty: Boolean = size == 0;
    def size: int;
}

In order to compose using traits, Scala offers the keyword with :

// compose using traits
class IntSet extends AbstractSet with Emptiness with Guard {
    ....
}

The following are the main characteristics of trait types in Scala :

  • Trait classes do not have any independent existence - they always exist tagged along with some other classes. However they provide a seamless way to add behaviours to existing classes while allowing default implementations to be defined at the base level.

  • Nierstrasz et al. observe
    An interesting feature of Scala is that traits can not only be composed but can also be inherited, which is a consequence of the fact that Scala traits are just special classes. This means that both classes and traits can be defined as an extension of other traits. For example, Scala allows one to define a trait B that inherits from a trait A and uses the two traits U and V:

    trait B extends A with U with V {
        ...
    }


  • In Scala, traits are first class citizens, in the sense that they are as much a part of the type system as classes are. Traits, like classes, introduce a new type and can be seamlessly integrated with the generics implementation of Scala. But that is for another day's story.


In order to get a feel for the power of mixin composition in Scala using traits in the real world, here's an example from Odersky:

// define an abstract class for the iterator construct

abstract class AbsIterator {
    type T;
    def hasNext: boolean;
    def next: T;
}

// enrich iterators with an additional construct foreach

trait RichIterator extends AbsIterator {
    def foreach(f: T => unit): unit =
    while (hasNext) f(next);
}

// yet another iterator for Strings
// provides concrete implementations for all abstract
// methods inherited

class StringIterator(s: String) extends AbsIterator {
    type T = char;
    private var i = 0;
    def hasNext = i < s.length();
    def next = { val x = s.charAt(i); i = i + 1; x }
}

// usage
// composition of StringIterator with RichIterator
// Using class inheritance along with trait inheritance
// without considerations for multiple definitions
// : only delta is inherited

object Test {
    def main(args: Array[String]): unit = {
    class Iter extends StringIterator(args(0))
        with RichIterator;
    val iter = new Iter;
    iter foreach System.out.println
    }
}

Mixins in Ruby - Compared

Going through the above piece, astute readers must have been wondering how mixin class composition in Scala compares with similar implementations in other modern programming languages. Let me end this post with a small comparison with the same feature as offered by Ruby.

Mixin implementation in Ruby is offered through modules. Define a module in Ruby, include it within your class - and you have access to all instance methods of the module. It's as simple as that - when it's Ruby, simplicity is the essence.

# define a module
module Debug
    def who_am_i?
    "#{self.class.name} (\##{self.object_id}): #{self.to_s}"
    end
end

# include in class
class Phonograph
    include Debug
    # ...
end

class EightTrack
    include Debug
    # ...
end

# get access to instance methods
ph = Phonograph.new("West End Blues")
et = EightTrack.new("Surrealistic Pillow")
ph.who_am_i? # "Phonograph (#937328): West End Blues"
et.who_am_i? # "EightTrack (#937308): Surrealistic Pillow"

But, modules in Ruby have its own clumsiness when it comes to defining instance variables within the mixin and resolving ambiguities in names. Moreover, modules are not strictly part of Ruby's type system - Scala makes it more elegant by seamlessly integrating the trait with its class based type system. The moment you achieve this, generics based mixins come for free. And that will be one my topics for the next installment - another powerful abstraction that makes component design easier in Scala - The Abstract Type Member.