Tuesday, May 11, 2010

Hackday: dependency searching using scala, jersey, gxp, mongodb

For my hackday project, I thought I would try to build an internal tool to let us more easily search our dependency repository. We use ivy for dependency management, and maintain our own repository in s3. It can be kind of a pain to track down the latest version of library X, especially if you're not sure what the organization is, or maybe you know the org and not the name. It seemed like a fun, useful project that I could tackle in a day, and that would allow me to play around with a couple of things I was interested in. To build it, I used jersey, gxp, and mongoDB. The whole thing was written using scala.

I borrowed the main layout from the SpringSource Enterprise Bundle Repository. I'm pretty happy with the results:

And the detail view:

There's also a browse view.

I've been really happy using scala and jersey, and I wanted something simple and easy for this project, so I thought it was worth a shot. After adding GXP for templating support, I have to say the combination of scala/jersey/GXP makes a pretty compelling framework for simple web apps.

As an example, here's the beginning of my 'Browse' Controller:

@Path("/b")
class Browse {
  val db = new RepoDB

  // /b/o: defaults to the organizations starting with 'A'
  @GET @Path("/o") @Produces(Array("text/html"))
  def browseOrg() = browseOrgLetter("A")

  // /b/o/{letter}, e.g. /b/o/G
  @GET @Path("/o/{letter}") @Produces(Array("text/html"))
  def browseOrgLetter(@PathParam("letter") letter : String) = {
    val orgs = db.getOrgLetters
    val results = db.findByOrgLetter(letter, 30)
    BrowseView.getGxpClosure("Organization", "o", orgs, letter, results)
  }

  ...
}

It's using nested paths, so /b/o is the main browse by organization page, /b/o/G would be all organizations starting with 'G'.
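To make the two RepoDB calls a bit more concrete, here's a rough in-memory sketch of what they might do. This is not the real RepoDB (which queries mongodb); `Artifact` and `InMemoryRepo` are made-up names, and returning plain organization strings is an assumption on my part.

```scala
// Hypothetical in-memory stand-in for RepoDB; the real class talks to mongodb.
case class Artifact(org: String, name: String, version: String)

class InMemoryRepo(artifacts: Seq[Artifact]) {

  // Distinct first letters of all organizations, upper-cased and sorted.
  // These would drive the letter navigation on the browse page.
  def getOrgLetters: Seq[String] =
    artifacts.map(_.org.take(1).toUpperCase).filter(_.nonEmpty).distinct.sorted

  // Organizations starting with the given letter (case-insensitive),
  // de-duplicated, sorted, and capped at `limit` entries.
  def findByOrgLetter(letter: String, limit: Int): Seq[String] =
    artifacts.map(_.org)
      .filter(_.toLowerCase.startsWith(letter.toLowerCase))
      .distinct
      .sorted
      .take(limit)
}
```

With something like this, a request for /b/o/G would boil down to `findByOrgLetter("G", 30)`, with `getOrgLetters` supplying the list of letters to link to.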

Then, I have a simple MessageBodyWriter that can render a GxpClosure:

class GxpClosureWriter extends MessageBodyWriter[GxpClosure] {
  val context = new GxpContext(Locale.US)

  override def isWriteable(dataType: java.lang.Class[_], ...) = {
    ...
  }

  override def writeTo(gxp: GxpClosure, ...) {
    val out = new java.io.OutputStreamWriter(_out)
    gxp.write(out, context)
    // flush so the buffered characters actually reach the response stream
    out.flush()
  }
}

And, that's really all there is to it. Nice, simple, and lightweight.

Last but not least, mongodb. It was probably overkill for this project, but I was looking for an excuse to play with it some more. I use it to store and index all of the repository information. I have a separate crawler process that lists everything in our repository s3 bucket, then stores an entry for each artifact. As part of this, it does some basic tokenizing of the organization and artifact names for searching. Searching on tokens like this was a little disappointing compared to lucene. Overall though, I'm pretty happy with it. Browsing and searching are both ridiculously fast. Like I said, it was probably overkill for the amount of data we have, but it can never be too fast. Speed is most definitely a feature.
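For the curious, the tokenizing step could look roughly like the sketch below. It's a simplified illustration, not the crawler's actual code: `NameTokenizer` is a made-up name, and splitting on dots, dashes, and underscores is an assumption about which separators matter.

```scala
// Hypothetical helper; the separator set (., -, _) is an assumption.
object NameTokenizer {

  // Split an organization or artifact name into lower-case search tokens,
  // dropping empty pieces and duplicates.
  def tokenize(name: String): Seq[String] =
    name.toLowerCase.split("[._-]+").toSeq.filter(_.nonEmpty).distinct

  // Tokens for one repository entry: org tokens followed by artifact tokens.
  def tokensFor(org: String, artifact: String): Seq[String] =
    (tokenize(org) ++ tokenize(artifact)).distinct
}
```

The resulting token list can be stored alongside each artifact entry, so a search term only has to match one of the tokens rather than the full name.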

Anyway, that's the wrap-up.

I'd be interested to hear other thoughts/experiences on mongodb from anyone out there.
