My current solution is to allow myself to get distracted but to drag myself back to my EMR session as soon as it's available. Adding some simple polling plus a sticky growl notification to my interactive-emr-startup script does the trick quite nicely:
#!/bin/bash
if [ -z "$1" ]; then
echo "Please specify a job name"
exit 1
fi
elastic-mapreduce \
(... with all of my favorite options ...) \
| tee ${TMP_FILE}
JOB_ID=`cat ${TMP_FILE} | awk '{print $4}'`
rm ${TMP_FILE}
# poll for WAITING state
JOB_STATE=''
MASTER_HOSTNAME=''
while [ "${JOB_STATE}" != "WAITING" ]; do
sleep 1
echo -n .
RESULT=`elastic-mapreduce --list | grep ${JOB_ID}`
JOB_STATE=`echo $RESULT | awk '{print $2}'`
MASTER_HOSTNAME=`echo $RESULT | awk '{print $3}'`
done
echo Connecting to ${MASTER_HOSTNAME}...
growlnotify -n "EMR Interactive" -s -m "SSHing into ${MASTER_HOSTNAME}"
ssh $MASTER_HOSTNAME -i ~/.ssh/emr-keypair -l hadoop -L 9100:localhost:9100
One of my personal productivity goals for the year is finding little places like this that I can optimize with a short script. This particular one has rescued me from the clutches of HN more than once!
No comments:
Post a Comment