How to think about OO

by Miško Hevery

Everyone seems to think that they are writing OO; after all, they are using OO languages such as Java, Python or Ruby. But if you examine the code, it is often procedural in nature.

Static Methods

Static methods are procedural in nature and they have no place in the OO world. I can already hear the screams, so let me explain why, but first we need to agree that global variables and global state are evil. If you agree with the previous statement, then for a static method to do something interesting it needs to have some arguments; otherwise it will always return a constant. Without global state, a call to staticMethod() must always return the same thing. (Time and random have global state, so they do not count, and object instantiation may return a different instance each time, but the object graph will be wired the same way.)

This means that for a static method to do something interesting it needs to have arguments. But in that case I will argue that the method simply belongs on one of its arguments. Example: Math.abs(-3) should really be -3.abs(). Now that does not imply that -3 needs to be an object, only that the compiler needs to do the magic on my behalf, which, BTW, Ruby got right. If you have multiple arguments, you should choose the argument with which the method interacts the most.
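To make the "method belongs on one of its arguments" idea concrete, here is a minimal sketch in Java. The Point class and both method names are hypothetical and not from the article; the static and instance versions compute the same thing, and only the home of the code changes.

```java
// Hypothetical example type, not from the article.
final class Point {
  private final double x;
  private final double y;

  Point(double x, double y) {
    this.x = x;
    this.y = y;
  }

  // Procedural style: a static helper that only reads its arguments' data.
  static double distance(Point a, Point b) {
    return Math.hypot(a.x - b.x, a.y - b.y);
  }

  // OO style: the same computation as behavior of one of the arguments.
  double distanceTo(Point other) {
    return Math.hypot(x - other.x, y - other.y);
  }
}
```

A caller would write a.distanceTo(b) instead of Point.distance(a, b). Here both arguments are used equally, so either could host the method; when one argument dominates, that is the one to pick.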

But most justifications for static methods argue that they are "utility methods". Let's say that you want to have a toCamelCase() method to convert the string "my_workspace" to "myWorkspace". Most developers will solve this as StringUtil.toCamelCase("my_workspace"). But, again, I am going to argue that the method simply belongs to the String class and should be "my_workspace".toCamelCase(). We can't add methods to the String class in Java, so there we are stuck, but in many other OO languages you can add methods to existing classes.
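Since java.lang.String is final and cannot be reopened, one possible workaround in Java (a sketch, not something the article proposes) is a small value class that owns the string and carries the behavior. The class name SnakeCaseName is hypothetical, and the conversion assumes lowercase words separated by underscores.

```java
// Hypothetical value type: wraps a snake_case name so the conversion
// lives with the data it operates on instead of in a StringUtil class.
final class SnakeCaseName {
  private final String value;

  SnakeCaseName(String value) {
    this.value = value;
  }

  // Converts "my_workspace" to "myWorkspace". Assumes lowercase words
  // separated by underscores; other inputs get no special handling.
  String toCamelCase() {
    StringBuilder result = new StringBuilder();
    boolean upperNext = false;
    for (char c : value.toCharArray()) {
      if (c == '_') {
        upperNext = true;              // drop the underscore, capitalize next char
      } else if (upperNext) {
        result.append(Character.toUpperCase(c));
        upperNext = false;
      } else {
        result.append(c);
      }
    }
    return result.toString();
  }
}
```

The call site becomes new SnakeCaseName("my_workspace").toCamelCase(), which is wordier than the ideal "my_workspace".toCamelCase() but keeps code and data together.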

In the end I am sometimes (a handful of times per year) forced to write static methods due to limitations of the language. But that is a rare event, since static methods are death to testability. What I do find is that in most projects static methods are rampant.
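The article does not spell out the testability problem here. One common reading is that a static call is hard-wired at the call site, so a test cannot substitute a different implementation, whereas an instance collaborator can be swapped. The sketch below illustrates that reading; Clock, SystemClock, Token and FakeClock are all hypothetical names.

```java
// Hypothetical seam: an interface the test can implement.
interface Clock {
  long nowMillis();
}

// Production implementation: the JDK's static call is confined to one place.
final class SystemClock implements Clock {
  public long nowMillis() {
    return System.currentTimeMillis();
  }
}

// Hypothetical class under test. It asks its collaborator for the time
// instead of calling System.currentTimeMillis() directly.
final class Token {
  private final Clock clock;
  private final long expiresAtMillis;

  Token(Clock clock, long lifetimeMillis) {
    this.clock = clock;
    this.expiresAtMillis = clock.nowMillis() + lifetimeMillis;
  }

  boolean isExpired() {
    return clock.nowMillis() >= expiresAtMillis;
  }
}

// Test double: time only moves when the test changes the field.
final class FakeClock implements Clock {
  long current = 0;

  public long nowMillis() {
    return current;
  }
}
```

With a FakeClock, a test can create a Token, advance the fake time past the lifetime, and check isExpired() without sleeping. Had Token called the static method directly, there would be no place for the test to intervene.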

Instance Methods

So you got rid of all of your static methods, but your code is still procedural. OO says that code and data live together. So when one looks at code, one can judge how OO it is without understanding what the code does, simply by looking at the relationship between data and code.

class Database {
 // some fields declared here
 boolean isDirty(Cache cache, Object obj) {
   for (Object cachedObj : cache.getObjects()) {
     if (cachedObj.equals(obj))
       return false;
   }
   return true;
 }
}

The problem here is that the method may as well be static! It is in the wrong place, and you can tell this because it does not interact with any of the data in the Database; instead it interacts with the data in the cache, which it fetches by calling the getObjects() method. My guess is that this method belongs to one of its arguments, most likely Cache. If you move it to Cache you will notice that the Cache no longer needs the getObjects() method, since the for loop can access the internal state of the Cache directly. Hey, we simplified the code (moved one method, deleted one method) and we have made Demeter happy.
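The move described above might look like the following sketch. The article never shows the inside of Cache, so the List field and the add() method are assumptions made only so the example compiles and runs.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of Cache after isDirty() moves in. The backing List and add()
// are assumptions; the article does not show Cache's internals.
class Cache {
  private final List<Object> objects = new ArrayList<>();

  void add(Object obj) {
    objects.add(obj);
  }

  // Same logic as Database.isDirty(), but it reads the field directly,
  // so no getObjects() accessor is needed.
  boolean isDirty(Object obj) {
    for (Object cachedObj : objects) {
      if (cachedObj.equals(obj)) {
        return false;
      }
    }
    return true;
  }
}
```

Callers now ask cache.isDirty(obj) and never see the collection that backs the cache.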

The funny thing about getter methods is that they usually mean that the code which processes the data lives outside of the class which has the data. In other words, the code and data are not together.

class Authenticator {
 Ldap ldap;
 Cookie login(User user) {
   if (user.isSuperUser()) {
     if ( ldap.auth(user.getUser(),
            user.getPassword()) )
       return new Cookie(user.getActingAsUser());
   } else if (user.isAgent()) {
       return new Cookie(user.getActingAsUser());
   } else {
     if ( ldap.auth(user.getUser(),
            user.getPassword()) )
       return new Cookie(user.getUser());
   }
   return null;
 }
}

Now I don't know if this code is well written or not, but I do know that the login() method has a very high affinity to user. It interacts with the user a lot more than it interacts with its own state. Except it does not really interact with user; it uses it as dumb storage for data. Again, the rule that code lives with data is being violated. I believe that the method should be on the object with which it interacts the most, in this case on User. So let's have a look:

class User {
 String user;
 String password;
 boolean isAgent;
 boolean isSuperUser;
 String actingAsUser;

 Cookie login(Ldap ldap) {
   if (isSuperUser) {
     if ( ldap.auth(user, password) )
       return new Cookie(actingAsUser);
   } else if (isAgent) {
       return new Cookie(actingAsUser);
   } else {
     if ( ldap.auth(user, password) )
       return new Cookie(user);
   }
   return null;
 }
}

OK, we are making progress. Notice how the need for all of the getters has disappeared (and in this simplified example the need for the Authenticator class disappears too), but there is still something wrong. The ifs branch on the internal state of the object. My guess is that this code-base is riddled with if (user.isSuperUser()). The issue is that if you add a new flag, you have to remember to change all of the ifs, which are dispersed all over the code-base. Whenever I see an if or a switch on a flag, I almost always know that polymorphism is in order.

class User {
 String user;
 String password;

 Cookie login(Ldap ldap) {
   if ( ldap.auth(user, password) )
     return new Cookie(user);
   return null;
 }
}

class SuperUser extends User {
 String actingAsUser;

 Cookie login(Ldap ldap) {
   if ( ldap.auth(user, password) )
     return new Cookie(actingAsUser);
   return null;
 }
}

class AgentUser extends User {
 String actingAsUser;

 Cookie login(Ldap ldap) {
   return new Cookie(actingAsUser);
 }
}

Now that we have taken advantage of polymorphism, each kind of user knows how to log in, and we can easily add a new kind of user to the system. Also notice how the user no longer has all of the flag fields which were controlling the ifs to give the user different behavior. The ifs and flags have disappeared.
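To show what the calling code looks like once the flags are gone, here is a self-contained sketch. The article does not define Ldap or Cookie, so the one-method Ldap interface, the Cookie class with an owner field, the constructors and the LoginDemo class are all assumptions added for illustration.

```java
// Minimal stand-ins; the article does not define Ldap or Cookie.
interface Ldap {
  boolean auth(String user, String password);
}

final class Cookie {
  final String owner;
  Cookie(String owner) { this.owner = owner; }
}

class User {
  final String user;
  final String password;

  User(String user, String password) {
    this.user = user;
    this.password = password;
  }

  Cookie login(Ldap ldap) {
    return ldap.auth(user, password) ? new Cookie(user) : null;
  }
}

class SuperUser extends User {
  final String actingAsUser;

  SuperUser(String user, String password, String actingAsUser) {
    super(user, password);
    this.actingAsUser = actingAsUser;
  }

  @Override
  Cookie login(Ldap ldap) {
    return ldap.auth(user, password) ? new Cookie(actingAsUser) : null;
  }
}

class AgentUser extends User {
  final String actingAsUser;

  AgentUser(String user, String actingAsUser) {
    super(user, null); // agents never authenticate, so no password is kept
    this.actingAsUser = actingAsUser;
  }

  @Override
  Cookie login(Ldap ldap) {
    return new Cookie(actingAsUser);
  }
}

class LoginDemo {
  public static void main(String[] args) {
    // Fake directory for the demo: accepts any user whose password is "secret".
    Ldap ldap = (name, password) -> "secret".equals(password);
    User[] users = {
      new User("alice", "secret"),
      new SuperUser("root", "secret", "bob"),
      new AgentUser("agent", "carol"),
    };
    // The caller never asks what kind of user it has; each one logs itself in.
    for (User u : users) {
      Cookie cookie = u.login(ldap);
      System.out.println(cookie == null ? "denied" : cookie.owner);
    }
  }
}
```

The loop in main contains no if on a user type. Adding another kind of user means adding a subclass, not hunting down scattered conditionals.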

Now this raises the question: should the User know about the Ldap? There are actually two questions in there. 1) Should User have a field reference to Ldap? and 2) Should User have a compile-time dependency on Ldap?

Should User have a field reference to Ldap? The answer is no, because you may want to serialize the user to a database but you don't want to serialize the Ldap. See here.

Should User have a compile-time dependency on Ldap? This is more complicated, but in general the answer depends on whether or not you are planning on reusing the User in a different project, since compile-time dependencies are transitive in strongly typed languages. My experience is that everyone writes code expecting to reuse it one day, but that day never comes, and when it does, the code is usually entangled in other ways anyway, so code reuse after the fact just does not happen. (Developing a library is different, since code reuse is an explicit goal.) My point is that a lot of people pay the price of "what if" but never get any benefit out of it. Therefore don't worry about it, and make the User depend on Ldap.

Google Operating System News

Tuesday, 26 July 2011
