Infrequently updated blog!

Things Calum may be responsible for

Here's a feature I made earlier

Saturday, 15 March 2008

After being suitably wowed by Max Bolingbroke's implementation of the Disposable pattern from C# in my current language-of-obsession, Scala, I was looking at some of the other features of C#/.NET which are quite nice, and wondering how they could be implemented.

One set of stuff that is available in C# is the much-talked-about LINQ system, which lets you combine and query lists in a way fairly similar to SQL. It also does a bunch of clever things (like actually translating these queries into SQL, from what I can tell) but let's gloss over that for now.

In any case, a few of these operators don't exist in the Scala library, and since it's fairly easy to push new things onto objects in Scala, I decided to see how easily I could model them. For kicks.

Before I start proper I should mention two things:

  • I'm assuming programming knowledge, and probably some Scala experience here. That said, if you're used to this type of language you shouldn't have too much trouble reading the code examples, and hopefully you'll like the way the language works. If there are any questions, though, just comment. I don't imagine I'm a great writer.
  • There are probably better ways to do this stuff; I'm not hugely experienced with the language (I have at least thought of alternate ways to do some of the things). I'd be interested to hear any other approaches people have on this.

This is the first of two posts. In this one I'm going to set the stage, and do something simple. In the next one I'm going to continue down the same road, and do something a little more complicated.

"Adding" our own methods onto an existing class

UPDATE: This follow-up article shows a way of doing this that should make the code a bit more re-usable. The principle is pretty much the same, though.

This is just a bit of preparation for other things. Trust me when I say that we're going to want to put (or appear to put) extra methods onto objects with the Iterable trait. It's what all the cool kids are doing.

Following the naming scheme put forth by the Scala standard library, we can create a RichIterable class which will house the extra stuff we're going to be putting on the normal Iterable class, and make it a class which "wraps" an existing iterable object. With an implicit type conversion function which will wrap it up transparently, we can treat any Iterable as a RichIterable wherever the function is imported. This is done like so:

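object RichIterable
{
  // Implicit conversion: wraps any Iterable in a RichIterable wherever this is imported
  implicit def iterable2RichIterable[A]( iterable: Iterable[A] ): RichIterable[A] = new RichIterable( iterable )
}

class RichIterable[A]( inner: Iterable[A] )
{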
  // Extra functions go here...
}

Now all one needs to do is add the line import whatever.package.RichIterable._ and the power of these new methods will be theirs.
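
To see the conversion actually kick in, here is a minimal sketch with one throwaway method on the wrapper. The secondOption method and the WrapperDemo object are made up purely for illustration (they are not part of the code above). The explicit result type and the scala.language.implicitConversions import are there on the assumption that you are on a newer compiler than the one this post was written against.

```scala
import scala.language.implicitConversions

class RichIterable[A](inner: Iterable[A]) {
  // Hypothetical extra method, for illustration only: the second element, if any.
  def secondOption: Option[A] = inner.drop(1).headOption
}

object RichIterable {
  // Wraps any Iterable transparently wherever this function is imported.
  implicit def iterable2RichIterable[A](iterable: Iterable[A]): RichIterable[A] =
    new RichIterable(iterable)
}

object WrapperDemo {
  import RichIterable._

  // List has no secondOption of its own, so the compiler inserts the conversion.
  val second: Option[Int] = List(1, 2, 3).secondOption
  val none: Option[Int] = List(1).secondOption
}
```

Without the import inside WrapperDemo, the call to secondOption should simply fail to compile, which is the point: the extra methods only show up where you ask for them.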

Pushing things into a map – toMap

This first one is the simplest of the methods I decided to implement; it is based on the ToDictionary method available in .NET. Basically the motivation is this:

I have a set of items with some distinguishing property. I would like to put them into a Map object keyed on this property so I can look them, or maybe some property derived from them, up quickly.

Here's a more concrete example; we have our typical terrible-example-of-object-orientation class representing a car by its registration plate number and the name of its driver, which looks like so:

class Car( registration: String, driver: String )
{
  val getRegistration = registration
  val getDriver = driver
}

We have a big list of cars (a List[Car], if you will), but what we really want is a map from the registration plate number to the driver name, so we can do something like this:

val guiltyDriver = registrationMap( registrationOfCarSeenLeavingInAHurry )

We'd like to generate this map quickly and easily, and this is where the toMap function comes in; we want to be able to do something like:

val registrationMap = carList.toMap( _.getRegistration, _.getDriver )

This is fairly easy to achieve via a number of means, but I chose to build the inputs to the immutable Map's factory method. It takes any number of two-item tuples (pairs) and gives you a Map object back. So to construct those tuples from values in the list in this case we could do something like this:

a => ( a.getRegistration, a.getDriver )

Or more generally for a given function for the key, and one for the value:

a => ( keyFunc(a), valFunc(a) )

So to map an entire Iterable object (like our List) using these two functions, we could do this:

val tuples = inner.map( a => ( keyFunc(a), valFunc(a) ) ).toStream

The toStream at the end is just there because the fussy Map factory method takes a variable number of arguments, and only a Seq (not a plain Iterable) can be expanded into those.

So to put this into a function we just need to take in the two functions, and push it out to the Map factory method:

def toMap[K,V]( keyFunc: (A) => (K), valFunc: (A) => (V) ): Map[K,V]  = {
  val tuples = inner.map( a => ( keyFunc(a), valFunc(a) ) ).toStream
    
  Map( tuples: _* )
}

This uses K for the type of the key, and V for the type of the value. The : _* annotation (a type ascription, not a type parameter) makes the Seq of tuples get passed as a bunch of separate arguments, since this isn't automatic. The Map factory method itself takes a variable number of arguments, similar to params in C# or the ... thing in Java.
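
As a small standalone illustration of that last point, here is a sketch using a made-up variable-argument function (total is hypothetical, not anything from the library) next to the Map factory method.

```scala
object VarargsDemo {
  // Hypothetical function taking any number of Ints.
  def total(numbers: Int*): Int = numbers.sum

  val asSeq = Seq(1, 2, 3)

  // total(asSeq) would not compile: a Seq[Int] is not an Int.
  // The ": _*" ascription passes each element as a separate argument instead.
  val six: Int = total(asSeq: _*)

  // The same trick with a Seq of pairs and the Map factory method.
  val pairs = Seq(("a", 1), ("b", 2))
  val lookup: Map[String, Int] = Map(pairs: _*)
}
```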

This now works fine; so we can do something like this:

import RichIterable._
  
val cars = List(
    new Car( "SANTA1", "Santa Claus" ),
    new Car( "PANDA1", "P.C. Plod" ),
    new Car( "12345", "Count Count" ) )
      
val registrationMap = cars.toMap( _.getRegistration, _.getDriver )

println( registrationMap )
  
println( registrationMap("SANTA1") )

And we get the output:

Map(SANTA1 -> Santa Claus, PANDA1 -> P.C. Plod, 12345 -> Count Count)
Santa Claus

...which is of course ridiculous, since Santa rides in a sleigh rather than driving a car.

For convenience, we can also cover the situation where we just want to push our objects into a map keyed by the result of a function; that is, the same as the above, but the value should just be the input object. This is likely to be a fairly common use-case, so let's not make people type a lot to do it:

def toMap[K]( keyFunc: (A) => (K) ): Map[K, A] = toMap( keyFunc, a => a )

This is just calling the existing toMap function with a => a (the identity function, a fancy name for a function which just hands back whatever it is given) as valFunc.
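
Putting the two overloads together, here is a self-contained sketch that can be pasted into a single file. One assumption to flag loudly: later versions of the standard library have a toMap of their own on collections, so the methods are called toMapBy here (a made-up name) to stay clear of it, and toSeq stands in for toStream. The CarDemo object is also just for illustration.

```scala
import scala.language.implicitConversions

class Car(registration: String, driver: String) {
  val getRegistration = registration
  val getDriver = driver
}

class RichIterable[A](inner: Iterable[A]) {
  // Key and value are both computed from each element.
  // toMapBy is a hypothetical name, chosen to avoid the library's own toMap.
  def toMapBy[K, V](keyFunc: A => K, valFunc: A => V): Map[K, V] = {
    val tuples = inner.map(a => (keyFunc(a), valFunc(a))).toSeq
    Map(tuples: _*)
  }

  // Key is computed from each element; the element itself becomes the value.
  def toMapBy[K](keyFunc: A => K): Map[K, A] = toMapBy(keyFunc, (a: A) => a)
}

object RichIterable {
  implicit def iterable2RichIterable[A](iterable: Iterable[A]): RichIterable[A] =
    new RichIterable(iterable)
}

object CarDemo {
  import RichIterable._

  val cars = List(
    new Car("SANTA1", "Santa Claus"),
    new Car("PANDA1", "P.C. Plod"),
    new Car("12345", "Count Count"))

  // Registration -> driver name
  val registrationMap: Map[String, String] =
    cars.toMapBy(_.getRegistration, _.getDriver)

  // Registration -> the Car object itself
  val carsByRegistration: Map[String, Car] =
    cars.toMapBy(_.getRegistration)
}
```

The second map hands back the Car itself, so you go through carsByRegistration("PANDA1").getDriver to get at the driver name.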

In the next entry, I'm going to do something more substantial; an analogue to GROUP BY in SQL, which was the feature of LINQ I wanted the most. If you're too anxious to wait, this code is available here. It's a little untidy and uncommented at present, though (and doesn't currently contain a bunch of improvements I've since added).
