Dave recently pushed some of my API changes to the main TalkingPuffin project. There are quite a few updates. The API is now more complete and more resilient, supports optional REST arguments, and has a method that loads all pages of a paged API call. I thought I’d show a few of the enhancements here.
This first listing shows how to use the new TwitterArgs class.
package org.talkingpuffin.twitter

object ShowAPI {
  def main(args: Array[String]) = {
    // set up our credentials and session
    val user = "foo"
    val password = "bar"
    val sess = TwitterSession(user,password)

    // same method as before
    var friendsTweets = sess.getFriendsTimeline()
    System.out.println("got " + friendsTweets.size + " tweets from old style method")

    // use per page count
    friendsTweets = sess.getFriendsTimeline(TwitterArgs.maxResults(200))
    System.out.println("got " + friendsTweets.size + " tweets using per page count")

    // use chained twitter args
    friendsTweets = sess.getFriendsTimeline(TwitterArgs.maxResults(200).page(2))
    System.out.println("got " + friendsTweets.size + " tweets using chained twitter args")
  }
}
A variety of methods now accept a TwitterArgs instance. You construct one by calling one of the methods on the TwitterArgs object, e.g.
val args = TwitterArgs.maxResults(200)
If you want to pass in multiple optional arguments, you can chain further calls on an existing args instance, e.g.
val args = oldArgs.page(2)
The arguments are converted to a URL query segment and appended to the URLs requested for Twitter data.
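To make the conversion concrete, here is a minimal sketch of how a chainable args object could render itself as a query segment. It is not the actual TwitterArgs source: QueryArgs, toQuery, the parameter names count and page, and the example.com URL are all assumptions made for illustration.

```scala
import java.net.URLEncoder

// Hypothetical stand-in for TwitterArgs; the real class may look different.
final case class QueryArgs(params: Vector[(String, String)] = Vector.empty) {
  // Each call returns a new instance, which is what makes chaining possible.
  def maxResults(n: Int): QueryArgs = copy(params = params :+ ("count" -> n.toString))
  def page(n: Int): QueryArgs = copy(params = params :+ ("page" -> n.toString))

  // Render as a query segment such as "?count=200&page=2", or "" when empty.
  def toQuery: String =
    if (params.isEmpty) ""
    else params
      .map { case (k, v) => URLEncoder.encode(k, "UTF-8") + "=" + URLEncoder.encode(v, "UTF-8") }
      .mkString("?", "&", "")
}

object QueryArgsDemo {
  def main(args: Array[String]): Unit = {
    // Placeholder base URL, not a real endpoint.
    val base = "http://example.com/friends_timeline.xml"
    val query = QueryArgs().maxResults(200).page(2).toQuery
    println(base + query)
  }
}
```

Because every method returns a fresh instance, an existing args value can be reused safely as the starting point for several different calls.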
The next listing shows some Scala functional programming neatness, and builds on my previous post about retry logic.
package org.talkingpuffin.twitter

object ShowAPI {
  def main(args: Array[String]) = {
    // set up our credentials and session
    val user = ""
    val password = ""
    val sess = TwitterSession(user,password)

    // demo load all... this just loads the first page
    var myTweets = sess.getUserTimeline("mccv")
    System.out.println("got " + myTweets.size + " tweets from my timeline")

    // now we show load all. loadAll just wants a function that takes an int as an arg,
    // which is the page. Scala's partially applied functions make this pretty easy
    // to use in a general purpose way
    myTweets = sess.loadAll(sess.getUserTimeline("mccv",_:Int))
    System.out.println("got " + myTweets.size + " tweets using loadAll")

    // this is even fancier. Here I add retry logic to load all.
    // note that retryPage is a function I defined here... but the session
    // doesn't care. It keeps iterating through pages, retrying and loading
    // until it reaches the end.
    myTweets = sess.loadAll(retryPage(_:Int,sess.getUserTimeline("mccv",_:Int)))
    System.out.println("got " + myTweets.size + " tweets using loadAll and retries")
  }

  /**
   * this is a function that is sort of a thunk through to tryNTimes.
   */
  def retryPage[T](page:Int, func: (Int) => T):T = {
    // here we define a privately scoped function
    // that can be passed to tryNTimes
    def tryPage() = {
      func(page)
    }
    // and now we try N (5) times
    tryNTimes(tryPage,5)
  }

  /**
   * from the last blog post, a retrier
   */
  def tryNTimes[T](func: () => T, runNumber: Int):T = {
    try {
      func()
    } catch {
      case e if runNumber > 1 => tryNTimes(func,runNumber - 1)
      case e => throw e
    }
  }
}
Hopefully this code is more or less self-documenting. The first session call just gets the first page of the user timeline. This is usually sufficient for writing a Twitter client, but it isn’t so great if you are doing data mining. The new API introduces a method called loadAll of type (f:(Int) => List[T]) => List[T]. This means that any function that takes a single Int argument (a page number) and returns a list can be passed to loadAll. It keeps calling the passed-in function with increasing page numbers until an empty list is returned. Note that, as currently implemented, this relies on a page overrun producing an empty list: if the overrun URI returned a 404, we’d get an exception instead. Luckily, Twitter currently just returns an empty list.
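For readers who want to see the paging loop spelled out, here is a minimal sketch of a function with the same shape as loadAll. It is my own stand-in rather than the TalkingPuffin implementation: starting at page 1 and the fakePage data source (45 items served 20 per page) are assumptions.

```scala
import scala.annotation.tailrec

object LoadAllSketch {
  // Call fetch with page 1, 2, 3, ... and concatenate the results,
  // stopping at the first page that comes back empty.
  def loadAll[T](fetch: Int => List[T]): List[T] = {
    @tailrec
    def loop(page: Int, acc: List[T]): List[T] = {
      val items = fetch(page)
      if (items.isEmpty) acc else loop(page + 1, acc ++ items)
    }
    loop(1, Nil)
  }

  // Hypothetical paged source: 45 items, 20 per page, empty past the end.
  def fakePage(page: Int): List[Int] =
    (1 to 45).toList.slice((page - 1) * 20, page * 20)

  def main(args: Array[String]): Unit = {
    println(loadAll(fakePage).size) // prints 45
  }
}
```

The empty-list stop condition is the only end-of-data signal here, which is why a source that throws on an overrun would break the loop instead of ending it.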
The second call shows this in action. It’s using a slightly more complicated case, because getUserTimeline takes a String and an Int. Scala’s partially applied functions make this a snap. The line
sess.getUserTimeline("mccv",_:Int)
takes the getUserTimeline call with one argument bound and one left unbound. It returns a function of type (Int) => List[TwitterStatus], which is exactly what loadAll wants.
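The same trick can be tried without a Twitter session. In this small sketch, timeline is a hypothetical stand-in for getUserTimeline that returns labels instead of tweets, and the name pagesFor is likewise made up.

```scala
object PartialApplicationSketch {
  // Hypothetical stand-in for getUserTimeline: returns labels, not tweets.
  def timeline(user: String, page: Int): List[String] =
    List(s"$user-page$page-a", s"$user-page$page-b")

  // Bind the user, leave the page open: the result is an Int => List[String].
  val pagesFor: Int => List[String] = timeline("mccv", _: Int)

  def main(args: Array[String]): Unit = {
    println(pagesFor(3)) // prints List(mccv-page3-a, mccv-page3-b)
  }
}
```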
The third call is even more complicated... um, sophisticated. Let’s say we want to retry operations five times, just in case we get dropped connections in the middle of a big load. All we need to do is hand loadAll a function that takes an Int and returns a list.
In a previous blog post I wrote about implementing a retryable method. You can see it more or less unchanged at the end of the file. Unfortunately, its signature isn’t quite what we want: tryNTimes expects a function with no arguments, while loadAll hands us a page number. So we define retryPage, which acts as an adapter from loadAll to tryNTimes. With this in place, we can set up our last call, which uses two partially applied functions. The first converts getUserTimeline into a page-argument-only form, and the second converts retryPage into a page-argument-only form.
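If you would like to watch the retry adapter work without hitting Twitter, here is a self-contained sketch. FlakySource is a made-up data source whose every page fails on its first request, and I have narrowed the catch to Exception, which differs slightly from the catch-everything version in the listing.

```scala
object RetrySketch {
  // Retrier in the style of the post: rerun func until it succeeds
  // or the attempts run out (the last failure then propagates).
  def tryNTimes[T](func: () => T, attempts: Int): T =
    try func()
    catch { case e: Exception if attempts > 1 => tryNTimes(func, attempts - 1) }

  // Adapter: fix the page so the call fits tryNTimes' () => T shape.
  def retryPage[T](page: Int, func: Int => T): T =
    tryNTimes(() => func(page), 5)

  // Hypothetical flaky source: each page fails the first time it is requested.
  final class FlakySource {
    private var seen = Set.empty[Int]
    var calls = 0
    def fetch(page: Int): List[Int] = {
      calls += 1
      if (!seen(page)) { seen += page; throw new RuntimeException(s"dropped page $page") }
      if (page <= 2) List(page * 10, page * 10 + 1) else Nil
    }
  }

  def main(args: Array[String]): Unit = {
    val source = new FlakySource
    // Two partial applications, as in the listing: one for fetch, one for retryPage.
    val robust: Int => List[Int] = retryPage(_: Int, source.fetch(_: Int))
    println(robust(1))    // prints List(10, 11)
    println(source.calls) // prints 2: one failure, one success
  }
}
```

The resulting robust function has the page-number-only shape, so it could be handed to any pager that expects an Int => List[T].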
Running the two ShowAPI listings combined gives us the following output:
got 20 tweets from old style method
got 199 tweets using per page count
got 200 tweets using chained twitter args
got 20 tweets from my timeline
got 824 tweets using loadAll
trying to get page 1
trying to get page 2
trying to get page 3
trying to get page 4
trying to get page 5
trying to get page 6
trying to get page 7
trying to get page 8
trying to get page 9
trying to get page 10
trying to get page 11
trying to get page 12
trying to get page 13
trying to get page 14
trying to get page 15
trying to get page 16
trying to get page 17
trying to get page 18
trying to get page 19
trying to get page 20
trying to get page 21
trying to get page 22
trying to get page 23
trying to get page 24
trying to get page 25
trying to get page 26
trying to get page 27
trying to get page 28
trying to get page 29
trying to get page 30
trying to get page 31
trying to get page 32
trying to get page 33
trying to get page 34
trying to get page 35
trying to get page 36
trying to get page 37
trying to get page 38
trying to get page 39
trying to get page 40
trying to get page 41
trying to get page 42
trying to get page 43
got 824 tweets using loadAll and retries