The one very annoying thing about Scala scripting is managing dependencies. My initial method was to have my shell preamble manually download the required libraries to the current directory and add them to the Scala classpath. So, my scripts looked something like this:
#!/bin/sh
# Download each jar only if it is not already in the current directory.
if [ ! -f commons-lang.jar ]; then
    s3cmd get [s3-location]/commons-lang.jar commons-lang.jar
fi
if [ ! -f google-collect.jar ]; then
    s3cmd get [s3-location]/google-collect.jar google-collect.jar
fi
if [ ! -f hadoop-core.jar ]; then
    s3cmd get [s3-location]/hadoop-core.jar hadoop-core.jar
fi
# Quote "$0" and "$@" so paths and arguments containing spaces survive.
exec /opt/local/bin/scala -classpath commons-lang.jar:google-collect.jar:hadoop-core.jar "$0" "$@"
!#
(scala code here)
This method has some rather severe scaling problems as the complexity of the dependency graph increases. I was about to step into the endless cycle of testing my script, finding the missing or conflicting dependencies, and re-editing it to download and include the appropriate files.
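For a handful of jars, the repeated blocks in that preamble can at least be folded into a pair of small functions. The following is only a sketch: `fetch_missing` and `join_classpath` are hypothetical names of my own, and `[s3-location]` is the same placeholder as in the script above.

```shell
# Hypothetical helper: download any jar that is not already present
# in the current directory. [s3-location] is a placeholder, as above.
fetch_missing() {
    for jar in "$@"; do
        [ -f "$jar" ] || s3cmd get "[s3-location]/$jar" "$jar"
    done
}

# Hypothetical helper: join the arguments with ':' to form a classpath.
join_classpath() {
    joined=""
    for jar in "$@"; do
        if [ -z "$joined" ]; then
            joined="$jar"
        else
            joined="$joined:$jar"
        fi
    done
    printf '%s\n' "$joined"
}

# Intended use in the preamble (shown as comments, not executed here):
#   fetch_missing commons-lang.jar google-collect.jar hadoop-core.jar
#   exec /opt/local/bin/scala \
#       -classpath "$(join_classpath commons-lang.jar google-collect.jar hadoop-core.jar)" \
#       "$0" "$@"
```

This tidies the preamble, but the list of jars still has to be maintained by hand, which is the real problem.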
Fortunately, there was an easy solution. We already use Ivy to manage dependencies in our compiled projects, and Ivy can be run in standalone mode outside of Ant. The key is the "-cachepath" command-line option, which makes Ivy write a classpath pointing at the cached dependencies to a specified file. So, now the preamble of my scripts looks like this:
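#!/bin/bash
# Temporary file that Ivy will write the resolved classpath into.
tempfile=$(mktemp /tmp/tfile.XXXXXXXXXX)
/usr/bin/java -jar /mnt/bizo/ivy-script/ivy.jar -settings /mnt/bizo/ivy-script/ivyconf.xml -cachepath "${tempfile}" > /dev/null
# Strip line endings so the classpath is a single line.
classpath=$(tr -d "\n\r" < "${tempfile}")
rm "${tempfile}"
# Quote "$0" and "$@" so paths and arguments containing spaces survive.
exec /opt/local/bin/scala -classpath "${classpath}" "$0" "$@"
!#
(scala code here)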
Now all I need is a standard ivy.xml file living next to my script, and Ivy will automagically resolve all of my dependencies and insert them into the script's classpath for me.
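One thing the preamble does not do is check that Ivy actually produced a usable classpath. If resolution fails, the temp file may be left empty and the script would start with no dependencies at all. A minimal sketch of a guard follows; `read_classpath` and `missing_entries` are hypothetical helper names of my own, not anything Ivy or Scala provides.

```shell
# Hypothetical helper: print the contents of a classpath file on one line,
# with carriage returns and newlines removed. Returns non-zero if the file
# is missing or empty.
read_classpath() {
    file="$1"
    if [ ! -s "$file" ]; then
        echo "empty or missing classpath file: $file" >&2
        return 1
    fi
    tr -d '\n\r' < "$file"
}

# Hypothetical helper: print each ':'-separated classpath entry
# that does not exist on disk, one per line.
missing_entries() {
    old_ifs="$IFS"
    IFS=":"
    for entry in $1; do
        [ -e "$entry" ] || printf '%s\n' "$entry"
    done
    IFS="$old_ifs"
}

# Intended use in the preamble (shown as comments, not executed here):
#   classpath=$(read_classpath "${tempfile}") || exit 1
#   [ -z "$(missing_entries "${classpath}")" ] || exit 1
```

Bailing out before the exec line means a failed resolution shows up as a clear error rather than as a missing-class error from somewhere inside the script.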
Crisis averted. Life is once again filled with joy and happiness.