

TalkingPuffin Twitter API Enhancement Samples

Dave recently pushed some of my API changes to the main TalkingPuffin. There are quite a few updates. The API is more complete now, more resilient, supports optional REST arguments, and has a method to load all pages of various APIs. I thought I’d show a few of the enhancements here.

This first listing shows how to use the new TwitterArgs class.

package org.talkingpuffin.twitter
 
object ShowAPI {
 
  def main(args: Array[String]) = {
    // set up our credentials and session
    val user = "foo"
    val password = "bar"
    val sess = TwitterSession(user,password)
    // same method as before
    var friendsTweets = sess.getFriendsTimeline();
    System.out.println("got " + friendsTweets.size + " tweets from old style method")
    // use per page count
    friendsTweets = sess.getFriendsTimeline(TwitterArgs.maxResults(200))
    System.out.println("got " + friendsTweets.size + " tweets using per page count")
    // use chained twitter args
    friendsTweets = sess.getFriendsTimeline(TwitterArgs.maxResults(200).page(2))
    System.out.println("got " + friendsTweets.size + " tweets using chained twitter args")
  }
}

There are a variety of methods that support passing in a TwitterArgs instance. These can be constructed by calling the various methods on the TwitterArgs object, e.g.

val args = TwitterArgs.maxResults(200)

If you want to pass in multiple optional arguments you can make calls on an existing args instance, e.g.

val args = oldArgs.page(2)

These get converted to a URL query segment and appended to the URLs called for Twitter data.
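To make the "converted to a URL query segment" part concrete, here is a minimal sketch of how a chainable args object could render itself. QueryArgs is a hypothetical stand-in, not the real TwitterArgs source, and the parameter names count and page are my assumption about what ends up in the URL.

```scala
import java.net.URLEncoder

// Hypothetical stand-in for TwitterArgs; the real class may look quite different.
final case class QueryArgs(params: Vector[(String, String)] = Vector.empty) {
  // each builder method returns a new instance, so calls can be chained
  def maxResults(n: Int): QueryArgs = copy(params = params :+ ("count" -> n.toString))
  def page(n: Int): QueryArgs = copy(params = params :+ ("page" -> n.toString))

  // render as a URL query segment, or an empty string when nothing is set
  def toQuery: String =
    if (params.isEmpty) ""
    else params
      .map { case (k, v) => URLEncoder.encode(k, "UTF-8") + "=" + URLEncoder.encode(v, "UTF-8") }
      .mkString("?", "&", "")
}

object QueryArgs {
  // entry points in the style of TwitterArgs.maxResults(200)
  def maxResults(n: Int): QueryArgs = QueryArgs().maxResults(n)
  def page(n: Int): QueryArgs = QueryArgs().page(n)
}
```

With this sketch, QueryArgs.maxResults(200).page(2).toQuery yields a segment that can be appended directly to a request URL.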

The next listing shows some Scala functional programming neatness, and builds on my previous post about retry logic.

package org.talkingpuffin.twitter
 
object ShowAPI {
 
  def main(args: Array[String]) = {
    // set up our credentials and session
    val user = ""
    val password = ""
    val sess = TwitterSession(user,password)
 
    // demo load all... this just loads the first page
    var myTweets = sess.getUserTimeline("mccv")
    System.out.println("got " + myTweets.size + " tweets from my timeline")
    // now we show load all.  loadAll just wants a function that takes an int as an arg,
    // which is the page.  Scala's partially applied functions make this pretty easy
    // to use in a general purpose way
    myTweets = sess.loadAll(sess.getUserTimeline("mccv",_:Int))
    System.out.println("got " + myTweets.size + " tweets using loadAll")
    // this is even fancier.  Here I add retry logic to load all.
    // note that retryPage is a function I defined here... but the session
    // doesn't care.  It keeps iterating through pages, retrying and loading
    // until it reaches the end.
    myTweets = sess.loadAll(retryPage(_:Int,sess.getUserTimeline("mccv",_:Int)))
    System.out.println("got " + myTweets.size + " tweets using loadAll and retries")
  }
 
    /**
    * this is a function that is sort of a thunk through to tryNTimes.
    */
    def retryPage[T](page:Int, func: (Int) => T):T = {
        // here we define a privately scoped function
        // that can be passed to tryNTimes
        def tryPage() = {
            func(page)
        }
        // and now we try N (5) times
        tryNTimes(tryPage,5)
    }
 
    /**
    * from the last blog post, a retrier
    */
    def tryNTimes[T](func: () => T, runNumber: Int):T = {
      try{
          func()
      } catch {
        case e if runNumber > 1 => tryNTimes(func,runNumber - 1)
        case e => throw e
      }
    }
 
}

Hopefully this code is more or less self documenting. The first session call just gets the first page of the user timeline. This is usually sufficient for writing a Twitter client, but it isn’t so great if you are doing data mining. The new API introduces a method called loadAll of type (f:(Int) => List[T]) => List[T]. This means that any function that takes a single Int argument (a page number) and returns a list can be passed to loadAll. It keeps executing the passed-in function with increasing page numbers until an empty list is returned. Note that, as currently implemented, loadAll depends on this behavior for page overruns: if the overrun URI returned a 404 we’d get an exception thrown instead. Luckily Twitter currently just returns an empty list.
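For the curious, the paging loop described above can be sketched in a few lines. This is my own reconstruction from the description, not the code in TwitterSession; Paging, PagingDemo and fakePage are made-up names, and the fake data source just serves 45 numbers 20 at a time.

```scala
import scala.annotation.tailrec

object Paging {
  // Keep calling fetch with increasing page numbers until an empty page comes back.
  def loadAll[T](fetch: Int => List[T]): List[T] = {
    @tailrec
    def loop(page: Int, acc: List[T]): List[T] = {
      val items = fetch(page)
      if (items.isEmpty) acc else loop(page + 1, acc ::: items)
    }
    loop(1, Nil) // pages are assumed to start at 1
  }
}

object PagingDemo {
  // fake data source: 45 items served 20 per page, empty list past the end
  private val all = (1 to 45).toList
  def fakePage(page: Int): List[Int] = all.slice((page - 1) * 20, page * 20)
}
```

Calling Paging.loadAll(page => PagingDemo.fakePage(page)) walks pages 1 through 4 and stops when page 4 comes back empty. Appending with ::: is fine for a sketch, but a buffer would be kinder for very long timelines.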

The second call shows this in action. It’s using a slightly more complicated case, because getUserTimeline takes a String and an Int. Scala’s partially applied functions make this a snap. The line

sess.getUserTimeline("mccv",_:Int)

takes the getUserTimeline call with one argument bound and one left unbound. It returns a function of type (Int) => List[TwitterStatus], which is exactly what loadAll wants.
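If the placeholder syntax is new to you, here it is in isolation. The timeline method is a hypothetical two-argument stand-in for getUserTimeline that returns strings instead of TwitterStatus objects.

```scala
object PartialDemo {
  // hypothetical two-argument method standing in for getUserTimeline
  def timeline(user: String, page: Int): List[String] =
    List(user + " tweet on page " + page)

  // bind the user, leave the page open: the result is an Int => List[String]
  val pageOnly: Int => List[String] = timeline("mccv", _: Int)

  // the same thing written as an explicit function literal
  val pageOnlyExplicit: Int => List[String] = (page: Int) => timeline("mccv", page)
}
```

Both values behave identically; the placeholder form is just shorter.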

The third call is even more complicated um, sophisticated. Let’s say we want to retry operations five times, just in case we get dropped connections in the middle of a big load. Well, all we need to do is get a function that takes an int and returns a list into loadAll.

In a previous blog post I wrote about implementing a retryable method. You can see this more or less unchanged at the end of the file. Unfortunately its signature isn’t quite what we want. So we define retryPage, which acts as an adapter from loadAll to tryNTimes. With this setup in place, we can set up our last call, which uses two partially applied functions. The first converts getUserTimeline into the page-argument-only form, and the second converts retryPage into a page-argument-only form.

Running the two samples combined gives us the following output

got 20 tweets from old style method
got 199 tweets using per page count
got 200 tweets using chained twitter args
got 20 tweets from my timeline
got 824 tweets using loadAll
trying to get page 1
trying to get page 2
trying to get page 3
trying to get page 4
trying to get page 5
trying to get page 6
trying to get page 7
trying to get page 8
trying to get page 9
trying to get page 10
trying to get page 11
trying to get page 12
trying to get page 13
trying to get page 14
trying to get page 15
trying to get page 16
trying to get page 17
trying to get page 18
trying to get page 19
trying to get page 20
trying to get page 21
trying to get page 22
trying to get page 23
trying to get page 24
trying to get page 25
trying to get page 26
trying to get page 27
trying to get page 28
trying to get page 29
trying to get page 30
trying to get page 31
trying to get page 32
trying to get page 33
trying to get page 34
trying to get page 35
trying to get page 36
trying to get page 37
trying to get page 38
trying to get page 39
trying to get page 40
trying to get page 41
trying to get page 42
trying to get page 43
got 824 tweets using loadAll and retries

Posted in scala.


A slightly cleaner Java Retryable

In my last post I walked through an implementation of a class that retries an operation N times before failing. However, after reading this post I realized that using Java’s Callable interface cleans things up quite a bit. This completely removes the need for a Retryable interface, and leaves you with Retryer implemented as

package org.mccv;
 
import java.util.concurrent.Callable;
 
/**
 * The implementation of our retrying class
 */
public class Retryer<T> {
	/**
	 * The retryable we will call
	 */
	private Callable<T> retryable;
 
	/**
	 * Just assign our retryable field 
	 */
	public Retryer(Callable<T> retryable){
		this.retryable = retryable;
	}
 
	/**
	 * Try to execute our retryable n times 
	 */
	public T tryTimes(int times) throws Exception{
		// store the last thrown recoverable exception
		RecoverableException lastException = null;
		// try the specified number of times
		for(int i = 0; i < times; i++){
			System.out.println("running it with " + (times-i) + " tries remaining");
			try{
				return retryable.call();
			}catch(RecoverableException re){
				lastException = re;
			}
		}
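		// lastException is still null only if times <= 0 and no attempt was made
		if(lastException == null){
			throw new IllegalArgumentException("times must be at least 1");
		}
		throw lastException;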
	}
}

And an implementation/usage demonstration as

package org.mccv;
 
import java.util.concurrent.Callable;
 
/**
 * A Callable implementation that replaces the old Retryable implementation
 */
public class RetryableImpl implements Callable<String>{
	/**
	 * Show usage of the retryer
	 */
	public static void main(String [] args){
		// create a new retrier
		Retryer<String> retrier = new Retryer<String>(new RetryableImpl());
		// run the retrier 
		try{
			System.out.println("result = " + retrier.tryTimes(3));
		}catch(Exception e){
			System.out.println("failed after 3 tries");
		}
	}
 
	/**
	 * An intentionally flaky operation
	 */
	public String call() throws Exception{
		if(Math.random() > 0.3){
			throw new RecoverableException();
		}else{
			return "foo";
		}
	}
}

The Scala version is still cleaner, but this Java version is a substantial improvement over the first version. As Andrew said in the biotext post, getting to know (or in my case remembering) the more advanced Java concurrency classes is a good thing.

Posted in Uncategorized.


Things that are easier in Scala vol 1

Last week I ran into a case where I needed to add some logic to retry certain operations a number of times before failing. These operations can throw a variety of exceptions, some of which are recoverable, some of which are not. Due to a variety of issues the code in question must be Java. At least for the time being. So I set out to do this. The solution I came up with wasn’t bad as Java solutions go…

Step 1: identify recoverable exceptions. Because I control the code that can throw the exceptions this isn’t so bad. I create a simple RecoverableException class.

package org.mccv;
 
/**
 * A marker class for recoverable exceptions
 */
public class RecoverableException extends Exception{
 
}

Step 2: create an interface that defines a retryable operation.

package org.mccv;
 
/**
 * An interface that defines an execute operation to retry
 */
public interface Retryable<T> {
	public T execute() throws Exception;
}

Step 3: implement a class that will retry a retryable n times before failing.

package org.mccv;
 
/**
 * The implementation of our retrying class
 */
public class Retryer<T> {
	/**
	 * The retryable we will call
	 */
	private Retryable<T> retryable;
 
	/**
	 * Just assign our retryable field 
	 */
	public Retryer(Retryable<T> retryable){
		this.retryable = retryable;
	}
 
	/**
	 * Try to execute our retryable n times 
	 */
	public T tryTimes(int times) throws Exception{
		// store the last thrown recoverable exception
		RecoverableException lastException = null;
		// try the specified number of times
		for(int i = 0; i < times; i++){
			System.out.println("running it with " + (times-i) + " tries remaining");
			try{
				return retryable.execute();
			}catch(RecoverableException re){
				lastException = re;
			}
		}
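		// lastException is still null only if times <= 0 and no attempt was made
		if(lastException == null){
			throw new IllegalArgumentException("times must be at least 1");
		}
		throw lastException;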
	}
}

Step 4: implement the retryable interface with the operation we want retried

package org.mccv;
 
/**
 * An implementation of the Retryable interface
 */
public class RetryableImpl implements Retryable<String>{
	/**
	 * Show usage of the retryer
	 */
	public static void main(String [] args){
		// create a new retrier
		Retryer<String> retrier = new Retryer<String>(new RetryableImpl());
		// run the retrier 
		try{
			System.out.println("result = " + retrier.tryTimes(3));
		}catch(Exception e){
			System.out.println("failed after 3 tries");
		}
	}
 
	/**
	 * An intentionally flaky operation
	 */
	public String execute() throws Exception{
		if(Math.random() > 0.3){
			throw new RecoverableException();
		}else{
			return "foo";
		}
	}
}

This works, and you can use anonymous inner classes to wrap stuff that doesn’t directly implement Retryable. But it’s ugly. You have to make a lot of things final that you wouldn’t otherwise. For instance, let’s say you want your retryable to set a status code and response message: this gets difficult because any local variable an anonymous inner class touches has to be final. It’s also a bit clumsy in that you have a Retryer class in play, plus an implementation of Retryable. What I really wanted was a way to say

retry(myOperation,nTimes).

With Scala I got a lot closer.

package org.mccv
 
/**
 * A marker class indicating an exception that is recoverable.
 * Note that we could make this a trait instead.
 */
class RecoverableException extends Exception
 
/**
 * The retrier itself
 */
object Retryer {
	/**
	* The workhorse.  Runs the operation, and if successful returns the result.
	* If unsuccessful calls itself recursively with a decremented run number.
	*/
	def tryNTimes[T](func: () => T, runNumber: Int):T = {
	  println("running it with " + runNumber + " tries remaining")
	  try{
		  func()
	  } catch {
	    case e:RecoverableException if runNumber > 1 => tryNTimes(func,runNumber - 1)
	    case e => throw e
	  }
	}
}
 
object Main {
   def main(args: Array[String]):Unit = {
    val tries = 3;
    try{
    	println(Retryer.tryNTimes[String](flakyMethod,tries))
     }catch{
       case e => println("failed after " + tries + " tries")
     }
  }
 
  def flakyMethod():String = {
    if(Math.random > 0.3){
      throw new RecoverableException()
    }else{
      "foo"
    }
  }
}

This is almost exactly what I’d like. I define a simple object that has a tryNTimes method. Because it’s an object it’s sort of like a Java static method… I can call Retryer.tryNTimes() directly. This is a recursive call. If a RecoverableException is caught, it just calls tryNTimes with the number of tries decremented. The beauty of this is that you can pass in any zero argument function. It doesn’t matter what class it’s defined in… it just has to know about recoverable exceptions.
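One way to get even closer to retry(myOperation,nTimes) is a by-name parameter, which lets the caller pass a plain expression instead of a function. This is a sketch of a variation, not the code above: the Retry and RetryDemo objects are made-up names, RecoverableException is redeclared with a message so the block stands alone, and I lean on scala.util.Try, which arrived in the standard library well after this post's Scala version.

```scala
import scala.util.{Failure, Success, Try}

// redeclared here so the sketch is self-contained
class RecoverableException(msg: String) extends Exception(msg)

object Retry {
  // `op` is a by-name parameter: it is re-evaluated on every attempt
  def retry[T](times: Int)(op: => T): T =
    Try(op) match {
      case Success(value) => value
      // only recoverable failures are retried, and only while tries remain
      case Failure(_: RecoverableException) if times > 1 => retry(times - 1)(op)
      case Failure(e) => throw e
    }
}

object RetryDemo {
  private var calls = 0

  // fails with a recoverable exception on the first two calls, then succeeds
  def flaky(): String = {
    calls += 1
    if (calls < 3) throw new RecoverableException("attempt " + calls + " failed")
    "foo"
  }
}
```

The call site then reads Retry.retry(5)(RetryDemo.flaky()), with no wrapper function or interface in sight.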

I’ll be keeping an eye out for things that are easier in Scala and keeping this series up to date. I’m sure I’ll be running into more of them.

Posted in scala.


You can’t clone superstars

Something that comes up fairly frequently in my organization is the question of how to replicate success. When team A does really really well, there’s an urge to copy whatever magic it is they had. Unfortunately the approach usually devolves into how to copy the successful team, not copy the success of that team.

As an example, let’s say team A had a fantastic development lead. Somebody who could keep an eye out for development progress, overall quality, and what the customer really wanted. This development lead was granted fairly wide ranging influence over various parts of the project. They managed quality assurance, development, and were the final authority on what features made it into the project. And everything went great.

So now team B is spinning up. They have a good, but not fantastic development lead. However they do have a few quality assurance engineers who are rock stars, and a very good project manager. Should you grant the same authority to the development lead you did for team A? Of course not.

Replicating success is a matter of first getting good people. In big companies this can be fairly tough, as you usually have to play the hand you’re dealt. Teams often get paralyzed as they try to find the equivalent of the rock star development lead from team A. It’s critical to identify whether you can alter the members of the team (or to what extent you can swap out the members of the team) before planning the structure of that team.

After getting the people, the most critical factor is structuring the team so that team strengths are provided the freedom to succeed, and team weaknesses are appropriately supported. This is rarely accomplished by just applying the template from another team. The good leaders understand this, and are constantly looking for the factors that make a team successful. They then carefully consider the makeup of the next team, and put a structure in place to allow those specific team members to succeed.

Posted in organization.


Getting Started With The Talking Puffin Twitter API

Talking Puffin is a full blown desktop Twitter client, but it also includes an independently consumable Twitter API. This provides a Scala implementation of most Twitter API functions, buildable with Maven and includable in other projects.

The first step in using the Twitter API is getting the source. You can either download this from http://github.com/dcbriccetti/talking-puffin/ or use git to clone your own repository.

The next step is building the code. Go to the base directory of the project (the one with README.md), and execute mvn install. This will run the scala compiler on both the desktop and Twitter API projects and install them into your Maven repository. Note that you can also do this from the twitter-api subdirectory if you only want to build the Twitter API.

After the code has built successfully you can interact with the Twitter API via the Scala console. To do this, go to the twitter-api directory and run mvn scala:console. You should see something like the following

mcmac:twitter-api mmcbride$ mvn scala:console
[INFO] Scanning for projects...
[INFO] Searching repository for plugin with prefix: 'scala'.
[INFO] ------------------------------------------------------------------------
[INFO] Building TalkingPuffin Twitter API
[INFO]    task-segment: [scala:console]
[INFO] ------------------------------------------------------------------------
[INFO] Preparing scala:console
[INFO] [resources:resources]
[INFO] Using default encoding to copy filtered resources.
[INFO] [compiler:compile]
[INFO] Nothing to compile - all classes are up to date
[INFO] [scala:compile {execution: default}]
[INFO] Checking for multiple versions of scala
[INFO] Nothing to compile - all classes are up to date
[INFO] [scala:console]
[INFO] Checking for multiple versions of scala
Welcome to Scala version 2.7.3.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_07).
Type in expressions to have them evaluated.
Type :help for more information.

scala>

In the console you can now import the Twitter API package and create a session.

scala> import org.talkingpuffin.twitter._
import org.talkingpuffin.twitter._
import org.talkingpuffin.twitter._
 
scala> val sess = TwitterSession(twitterUser,twitterPassword)
val sess = TwitterSession(twitterUser,twitterPassword)
sess: org.talkingpuffin.twitter.AuthenticatedSession = org.talkingpuffin.twitter.AuthenticatedSession@9fbc913

Once you have a session, you can call Twitter methods… this example shows us getting messages from a user’s friends timeline

scala> val timeline = sess.getFriendsTimeline("mccv")
val timeline = sess.getFriendsTimeline("mccv")
timeline: List[org.talkingpuffin.twitter.TwitterStatus] = List(org.talkingpuffin.twitter.TwitterStatus@73ec8c2, org.talkingpuffin.twitter.TwitterStatus@2aee3c45, org.talkingpuffin.twitter.TwitterStatus@7eb6ec07, org.talkingpuffin.twitter.TwitterStatus@1b42008f, org.talkingpuffin.twitter.TwitterStatus@a32ba44, org.talkingpuffin.twitter.TwitterStatus@862cb97, org.talkingpuffin.twitter.T...

The object returned is a List of TwitterStatus objects. The TwitterStatus object is a container for a single tweet, and has fields for most of the elements of the XML returned from the Twitter API. The following example shows a way to iterate over this list and print out the user and text. Note that the user field of the TwitterStatus object is a TwitterUser object, which contains info about a user.

scala> timeline foreach {(x) => println(x.user.name + "--\t" + x.text)}
timeline foreach {(x) => println(x.user.name + "--\t" + x.text)}
Dave Winer--	Switched at Birth, Women Find New Identity. http://tr.im/lixe
Dave Winer--	Yes, there is something super-ironic about a long list of URL shorteners. :-) http://tr.im/li3i
Robert Dempsey--	Yes, I'm on a bus in Austin.   http://yfrog.com/emz3oj
Dave Winer--	The NY Times/Twitter feed (Scripting News). http://tr.im/lixY
Dave Winer--	No LOST spoilers!!
Twitter API--	Starting something serious on the API? Want to launch your project May 26? Launch at @140tc: http://bit.ly/8jj8x ^DW
Dave Briccetti--	Trying to get more RAM on my shit dedicated server, davebsoft.com. Shit because OLM wanted a fortune per month to add dirt-cheap RAM.
Michael Arrington--	FriendFeed Enables People/Group Tracking http://tcrn.ch/1t4 by @parislemon
rick--	OMG LOST finale in a couple hours
Steve Spalding--	A hard lesson to learn is that you can't hand people success. The best you can do is give them the tools.
Michael Arrington--	Google's Last MySpace Payment: $75 Million On June 20, 2010 http://tcrn.ch/1t9 by @arrington
Alex Payne--	On my way to have a beer with two of the @dropbox guys. Once tipsy, I may request more jiggabytes.
Robert Dempsey--	OH: "They all run in the same circles. They fight, they hang out, they have transvestite beauty pageants."
Albert Wenger--	On the ground at SFO - meeting up with @joshu for dinner.
NewsGang--	Calling all iPhones! Emergency scanner apps on the loose! (from Steven Sande) : Filed under: Software, Odds and .. http://tinyurl.com/qpom96
NewsGang--	Quite Possibly the Coolest Hot Tub You Have Ever Seen | dornob (from dornob.com) http://tinyurl.com/p2r54h
Dave Winer--	Red tree: http://tr.im/liMR
Dave Winer--	Guerilla Cafe, Berkeley: http://tr.im/liMS
Dave Winer--	Tree with red leaves in sunset: http://tr.im/liMT
Dave Troy--	There will be a 2010 re-enactment of the 1871 Baltimore parade celebrating the ratification of the 15th amendment (my college thesis).
scala>

Note that we can get a bit fancier with this and use for comprehensions to do more or less the same thing. The following sample uses the API to filter the friends timeline, looking for any text with the word “Berkeley”

scala> for(x <- sess.getFriendsTimeline("mccv") if x.text.indexOf("Berkeley") >= 0) println(x.text)
for(x <- sess.getFriendsTimeline("mccv") if x.text.indexOf("Berkeley") >= 0) println(x.text)
Guerilla Cafe, Berkeley: http://tr.im/liMS
 
scala>

This should give you a pretty decent start on using the API. There are many more functions available, and you can take a look by generating the scala doc and checking out the API methods. To generate the site, go to your twitter-api directory and run “mvn site”. This should generate a set of HTML pages in target/site. Open index.html, click project reports, click scala docs, and look at the generated documentation for AuthenticatedSession.

Posted in scala.


Everybody fights

I really like Starship Troopers. Note that this is not to say that I in any way like Starship Troopers. One of the things that caught me was the notion that in the mobile infantry everybody fights. Cooks, captains, chaplains… when it comes time to drop everybody has skin in the game.

There have been a variety of things bothering me about the latest project I’m on, and I think I’ve finally identified the root cause. There are a number of people who have a say… but when it comes time to get work done they don’t/can’t get their hands dirty and get things done. We’re entering planning for the next six to eighteen months of our project, and one of my big pushes is going to be reducing the number of people who don’t fight. Feedback and collaboration are nice. But if you want a seat at the decision making table, be ready to fight.

Posted in agile.


Scala exceptions vs. pattern matching

I’ve recently been working on the Talking Puffin project, particularly on the lower level Twitter API. At the plumbing level, we want a class that allows us to fetch XML from a URL. In Java doing this sort of stuff usually involves quite a few checked exceptions. The interwebs are flaky, and ignoring that fact usually leads to really bad things happening. However Scala doesn’t have checked exceptions… and that’s a nice thing for flexibility, conciseness, etc. So what to do?

My two basic approaches were

  1. Throw exceptions on errors. This allows more concise code when you don’t care about the errors (maybe you’re dealing with them higher up)
  2. Use case classes with subclasses indicating successful and unsuccessful execution. This forces you to deal with error conditions right away, which may lead to more stable code

Let’s take a look at implementations of both. First we set up a simple base class to deal with HTTP plumbing. This just provides a way to open a URL and a helper method to read the error stream into a string.

import java.io.{BufferedReader, InputStreamReader}
import java.net.{HttpURLConnection, URL}
 
class HttpBase{
  /**
  * Open the specified URL and return a tuple containing the corresponding
  * HTTP response code and HttpURLConnection
  */
  def openConnection(urlStr:String) = {
    val url = new URL(urlStr)
    val conn = (url.openConnection).asInstanceOf[HttpURLConnection]
    val responseCode = conn.getResponseCode()
    (responseCode,conn)
  }
 
  /**
  * Utility to open a connection's error stream and read it into a string
  */
  def getErrMsg(conn:HttpURLConnection) = {
    var errMsg = ""
    val reader = new BufferedReader(
          new InputStreamReader(conn.getErrorStream()))
    var line = reader.readLine()
    while(line != null){
      errMsg += line
      line = reader.readLine()
    }
    errMsg
  }
}

Now let’s try to implement a class that actually fetches XML, and throws an exception if errors are encountered.

import scala.xml.{Node, XML}
 
case class HttpException(val msg:String, val code:Int) extends Exception
 
/**
* Provides a getXML method that fetches XML from a URL.
* Throws an exception if any errors are encountered
*/
class HttpXMLExceptions extends HttpBase{
  def getXML(urlStr:String):Node = {
    try{
      openConnection(urlStr) match {
        case (200,conn) => XML.load(conn.getInputStream())
        case (code,conn) => throw HttpException(getErrMsg(conn),code)
      }
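    }catch{
      // rethrow our own exception untouched so the HTTP status code survives
      case e:HttpException => throw e
      case e => throw HttpException(e.toString,-1)
    }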
  }
}

And here’s a usage sample. This is concise, but (by design) we aren’t forced to think about nasty things like 404s, 401s, dropped connections, broken pipes, and all sorts of other networky hobgoblins.

    val excs = new HttpXMLExceptions()
    val url = "http://twitter.com/statuses/public_timeline.xml"
    val content = excs.getXML(url)

Let’s try to force people to think about them. Instead of throwing an exception, we can define a hierarchy of case classes for our responses, like so.

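// abstract base type; newer Scala versions reject a case class that extends another case class
abstract class Response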
case class Success(val content:Node) extends Response
case class Error(val msg:String,val code:Int) extends Response

Our method will return a type of Response. If our request is successful we’ll get a Success object that has a content field. If we run into any errors we’ll instead get an Error object back with a response code and error message from the response body. This is similar to using Scala’s Option type, however this allows us to provide information on failure instead of simply returning the None instance. Here we go…

/**
* Provides a getXML method that fetches XML from a URL.
* Always returns a Response object, regardless of success/failure
*/
class HttpXMLMatches extends HttpBase{
  def getXML(urlStr:String):Response = {
    try{
      openConnection(urlStr) match {
        case (200,conn) => Success(XML.load(conn.getInputStream()))
        case (code,conn) => Error(getErrMsg(conn),code)
      }
    }catch{
      case e => Error(e.toString,-1)
    }
  }
}

This is pleasantly similar in implementation… However usage looks quite a bit different…

    val matches = new HttpXMLMatches()
    val url = "http://twitter.com/statuses/public_timeline.xml"
    val content = matches.getXML(url) match {
      case Success(node) => node
      case Error(msg,code) => 
        <error><msg>{msg}</msg><code>{code}</code></error>
    }

This is quite a bit more verbose. However it does send a message that developers need to be conscious of the potential error case. Let’s look at side by side usage when we add error handling for an unknown host

    val excs = new HttpXMLExceptions()
    val badUrl = "http://nohost.twitter.com/statuses/public_timeline.xml"
    val noContent = try{
      excs.getXML(badUrl)
    }catch{
      case HttpException(msg,code) => 
        <error><msg>{msg}</msg><code>{code}</code></error>
    }
 
    val matches = new HttpXMLMatches()
    val noMatchContent = matches.getXML(badUrl) match {
      case Success(node) => node
      case Error(msg,code) => 
        <error><msg>{msg}</msg><code>{code}</code></error>
    }

This usage actually looks pretty similar too. This is due to the fact that a try/catch evaluates like a pattern match. In the exception case your error conditions are separated from the main logic, which some may like and some may not.

The primary difference is that using the Response class hierarchy forces the user to consider the unsunny day scenarios. I don’t think there’s a black and white rule on when to use either approach, but case classes provide you a way to expose error conditions more visibly than unchecked exceptions. In the talking puffin project I think we’ll stick with exceptions, but it’s nice to have options.
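One variation that pushes the "forces the user" point further, and is not what the listings above do, is to mark the base type as sealed. The compiler then knows every subclass and warns when a match leaves one out. In this sketch the names Fetched, Failed and Render are made up to avoid clashing with standard library names, and the content is a plain String rather than a Node so the block needs no XML library.

```scala
// sealed: every subclass must live in this file, so the compiler knows the full set
sealed trait Response
final case class Fetched(content: String) extends Response
final case class Failed(msg: String, code: Int) extends Response

object Render {
  // removing either case below makes the compiler warn that the match may not be exhaustive
  def describe(r: Response): String = r match {
    case Fetched(content) => "ok: " + content
    case Failed(msg, code) => "failed (" + code + "): " + msg
  }
}
```

Callers still pattern match exactly as before; the only difference is that forgetting the failure case becomes a compile-time warning instead of a runtime MatchError.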

Posted in scala.


How J2EE set architecture back a ways

I’ll start this off by saying this is pure opinion.  I don’t have any statistics.  Instead I have some gut intuition based on numerous “enterprise Java” projects within my company, and through observing the development community.

I’m currently involved in architecting a next generation systems management framework.  It has fairly hefty requirements… monitor and manage millions of objects distributed across the globe in real time.  Take out the “millions” bit, and this isn’t so bad.  There are a bunch of standards out there (JMX, SNMP), there are some off the shelf tools, and you can knock something together pretty quickly that covers a reasonable subset of equipment for small installs.  But we need to do better.   We need to manage more than just “most” of the equipment, and we need to do it in some of the world’s biggest data centers.  And as I talk about this with my peers from other groups,  I’m getting more convinced that J2EE has poisoned a good chunk of the current generation of architects.

I know this isn’t how J2EE  has to work, but in most projects, you start them like this…
1) Build your object model, use magic to map it to the DB
2) Build your business logic
3) Build your front end
4) Ship it!

And there you are.  You have a nice central database so you don’t have to worry about distributed data so much.  It’s all nice and safe on the disk.  You have some adapters to get stuff in and out, and a cozy UI that serves it all up to the user.  And it scales… for as much load as you can throw at your desktop.

Then you take it to some place big.  And it just flat out doesn’t work.  It’s not a matter of getting a bigger database, because you just can’t scale it up past a certain point. You can cluster, but now you need a professional services group to install your product, and the performance gains to be had aren’t crystal clear.

It becomes a matter of deciding what goes in there.  And when, and how.  But looking forward to these issues is met with fierce resistance in design meetings… Building distributed systems is much harder than building something around a central store.  Objections are typically based on reading marketing propaganda from database vendors.  Responses of the form “well, we’ll partition the data when we hit that load.  Or use a cluster… yeah, a cluster will fix everything” are extremely painful to work through.  After all, some “expert” from VendorX says it will work. Who are you, my peer, to assume you know more than me?  

But guess what?  Each of these objectors has a product in the field, based on a central database, handling far less load than our targets, and none of them scale well.

Scaling out is beating scaling up.  Processors are going multi-core, not taking huge leaps forward in the GHz war.  Scala and Erlang are gaining in popularity, and with them a different model of parallelism (not new, just different) than that offered by J2EE.  The actor model embraced by both of those languages is primarily targeted at small processes within a service.  But as you look at larger chunks of the architecture, I think the sanest way forward is to embrace a similar model.  

ESB has been a buzzword for years now.  I don’t buy all the marketing hype around it, but at its core there is a model that the current generation of architects needs to get a handle on.  It’s very similar in concept to the Erlang/Scala actor model.  Yes, it’s harder to design around asynchronous messages and a large number of distributed components.  But that’s how you handle the big problems.  And that’s why you hire good developers to work on them, not just some random CS grad who happens to be able to regurgitate the latest Spring book.  
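To make the message-passing idea a little more concrete, here is a minimal sketch of an actor-style component in Scala.  It is hand-rolled on top of java.util.concurrent rather than any real actor library or ESB, and the names (Message, Tweet, Stop, TweetCounter) are made up for illustration.  The point is only the shape: callers drop immutable messages into a mailbox and move on, and the component's state is touched by exactly one thread.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue}

// Messages are plain immutable values; senders never touch the receiver's state.
sealed trait Message
final case class Tweet(user: String, text: String) extends Message
case object Stop extends Message

// Hypothetical hand-rolled "actor": one mailbox, one thread draining it.
final class TweetCounter {
  private val mailbox = new LinkedBlockingQueue[Message]()
  private val done = new CountDownLatch(1)
  // Only the worker thread below ever mutates this.
  private var counts = Map.empty[String, Int]

  private val worker = new Thread(new Runnable {
    def run(): Unit = {
      var running = true
      while (running) {
        mailbox.take() match {
          case Tweet(user, _) =>
            counts = counts.updated(user, counts.getOrElse(user, 0) + 1)
          case Stop =>
            running = false
        }
      }
      done.countDown()
    }
  })
  worker.start()

  // Asynchronous send: enqueue the message and return immediately.
  def !(msg: Message): Unit = mailbox.put(msg)

  // Block until Stop has been processed, then expose the final counts.
  def awaitResult(): Map[String, Int] = {
    done.await()
    counts
  }
}

object ActorSketch {
  def main(args: Array[String]): Unit = {
    val counter = new TweetCounter
    counter ! Tweet("dave", "hello")
    counter ! Tweet("mark", "hi")
    counter ! Tweet("dave", "again")
    counter ! Stop
    // dave is counted twice, mark once
    println(counter.awaitResult())
  }
}
```

Swap the in-memory queue for a message bus and the thread for a separate process, and you have roughly the larger-grained version of the same model: components that share nothing and only talk through asynchronous messages.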

I’m not saying there isn’t a place for the traditional J2EE buildout (Rails has a similar model, and works wonderfully for small to medium-sized projects).  And in fact, even with these asynchronous systems there are almost certainly services that need a DB backing them.  But there are other tools that should be in your toolbelt.

My suggestion to Java architects today would be to pick up another language that fosters a totally different architecture.  I’m currently biased towards Ruby, Erlang or Scala.  There are no guarantees that any of these scale better than Java (and it’s fairly easily arguable that Ruby doesn’t).  But any of these will make you see the world in a different way, and allow you to more effectively consider and evolve your current architecture.

Posted in opinion.



Scala Lift Off - Static Companion to Ruby?

So I went to the last half of Scala Lift Off on Saturday (only half, because the first half was taken up by my final MBA class.  Ever.).  I went primarily out of curiosity, not knowing much about Scala or Lift.  The main draw was Lift’s built-in Comet support, which doesn’t seem to be a focus in other frameworks… at least not in Rails.  We currently use Juggernaut for Comet support, but depending on Flash is something of a liability (see: iPhone), and Juggernaut itself isn’t as smoothly integrated with Rails as I’d like.

I came away extremely impressed.  Scala is relatively unheralded in the world of alternative JVM languages (see Groovy, Jython and JRuby publicity), but shows a lot of promise.  It’s a functional language with an expressive syntax that allows you to easily create code that looks DSL-ish.  These are the primary features that drew me to Ruby (ok, Ruby isn’t a functional language, but you can sorta fake it).  But Scala has a better integration story with existing Java libraries, is statically typed, and has a stronger functional bent.

I’m a big believer in the right tool for the job, and as such don’t fall into a pure-dynamic or pure-static language camp.  I also don’t fall into a single language camp. I really enjoy Ruby for quick prototyping, and love Rails for quick prototyping of webapps, and maintaining a nimble production face on web applications.  But Rails falls down when I need to run background processing.  The times I think hardest about moving back to a Java webapp environment are when I need to go write something that doesn’t just receive a web request and terminate.  This is where concurrency issues get painful to deal with, Ruby daemons/DRb are painful, and starting up a whole Rails env for simple processing is rough.

So I’m hoping Scala/Lift fills that void.  I’m mentally sketching out a replacement of our background processing jobs (Twitter integration, email processing, etc.) with Scala, and in particular the Actors library.  These are relatively simple processing tasks, and should give me a decent feel for the language.  It should also improve the stability and scalability of our background processing.  It may also yield some reasonable libraries to contribute back (Scala Twitter library, Scala ActiveRecord bridge). 

Once I have that nailed down, an evaluation of how Lift can/should fit into the framework is in order… or maybe I’ll have to start my Rettiwt side project based on Lift.

Posted in scala.


System and Organizational Scaling - the Enterprise View

Albert Wenger put up a good post talking about the challenges faced by their startup portfolio, and how a vertical approach to subsystem division helps scale the organization.  In fact the Web 2.0 landscape is very reflective of this approach.  Ten years ago, if Yahoo needed an authorization framework they would have built it themselves.  Today people use OAuth.  Flickr, Twitter, Delicious, Campfire, Meebo, S3… all very focused services that delegate non-core functionality to another service where possible.  For those things that are duplicated across services (web frameworks, database backends), nobody cares if they’re different… it’s hidden behind the service.

This is very attractive for the startup ecosystem.  The tangible results have been products that are cheaper to launch, quicker to market, and easy to adapt to customer feedback.  But how does it apply to the enterprise?

Issue 1:

For better or worse, in large engineering organizations people tend to care that the common horizontal components are indeed common.  If you work in a 100+ person engineering organization, and the rest of the team is using Spring + Struts for web development, it’s usually a tough sell to start using Rails.  If the organization insists on this homogeneity, your vertically sliced org is cross cut by the commonality police, hampering the agility of the vertical team.  There are good reasons for this.  Somebody needs to support and test the thing.  If nobody else can deal with what you just built, it doesn’t have a path to market, and is therefore useless.

Issue 2:

It’s easier to add 1 person or 1 feature to an existing service than it is to spin up a whole new service team.  Most organizations usually don’t have the luxury of five new reqs to apply to a service.  And they’re also hesitant to carve out five people from existing teams to spin up a new service.  So teams and services are usually built by accretion.  The result is usually a gradual march to collapse, as services become bloated and so difficult to maintain they need to be replaced.

Issue 3:

The organization only has capacity for a limited number of products.  A startup of 5-10 people can fully support the development, launch, and marketing of a vertical service.  In a large organization, the amount of infrastructure required to take a product to market means that unless you have a multi-million dollar revenue stream guaranteed in year 1 your chances of getting marketing, sales, training, doc, etc. spun up to support you are very small.  That means actual releases are larger than individual services.

Issue 4:

Coordination challenges across vertical services.  This ties in to issue 3.  Because a product release is a composition of multiple services, there is a desire for tight coordination across these services.  Contrast this to the Web 2.0 world.  Basecamp uses Amazon’s S3 storage service.  They are under no illusion that they can call the shots on S3’s feature roadmap.  And even more important, they would not plan for a release based on features that S3 has not committed to.  In the enterprise, these rules don’t apply.  Feature roadmaps for a collection of services are determined at the same time, and management hopes to coalesce them all into a product at a predefined future date.

Possible Solutions

To address issue 1, the organization should focus on service capability, not underlying implementation.  To support this, service teams will need to be self-sufficient, and should make implementation decisions as a team.  Whatever they pick needs to be tested.  Involve quality engineering in the technology selection.  It needs to be documented.  Involve doc.  At the end of the day the customer rarely cares if you use VB or Python to get them their value.

To address issue 2, the entire organization needs to be focused on keeping services focused on their core function.  When somebody needs feature X, run through the list of services.  Teams should be willing to say no to features not because they don’t have capacity, but because the feature corrupts the purpose of the service. If the feature doesn’t fit with any project, spin up a new service rather than accrete it onto an existing project.  This requires the organization to be flexible, to be willing to work on new things, and to give up old responsibilities.  It also demands a rejection of empire building.  You should take more pride in your project being small, focused, and absolutely fantastic at what it does, rather than measuring your worth by the number of people on your project.

Issues 3 and 4 are the stickiest.  In the enterprise the opportunity for experimentation is much lower than it is for a consumer web offering.  Perhaps one option is to create a “labs” organization that releases products for free to bleeding edge customers.  Services that are vetted through the “labs” channel could be productized at a later date.

Coordination problems could be solved by eliminating planning that spans services. Take a snapshot of service capabilities, and plan from there.  This likely slows down development to a certain extent, but also makes individual plans more predictable, and significantly reduces coordination challenges.

Wrapup 

While challenges exist in the enterprise, I think it’s worthwhile to look at how we can make vertical slicing of the org work.  The advantages realized by Web 2.0 companies are compelling, and we definitely have challenges with our current horizontally structured groups.

Posted in organization.