Monday, November 30, 2009

quick script: open hadoop jobtracker UI with elastic map reduce

If you've ever logged into the Hadoop master on Amazon's Elastic MapReduce, you've seen something like:

The Hadoop UI can be accessed via the command: lynx http://localhost:9100/

Great, but lynx? Not as nice as Firefox or Safari...

It's easy enough to do some ssh port forwarding so you can use your browser of choice and access the hadoop UI from your machine.

But after getting tired of typing in the ssh options a bunch of times, I finally put together a short script that automates it a bit. The script takes the public hostname of your Hadoop master (you can get this from elastic-mapreduce --list), picks a random local port, forwards that port over ssh to port 9100 on the master, and opens the page in a new browser window.
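The "pick a random port" step can be done a few ways. Here's a minimal sketch of one, assuming a POSIX shell with od and /dev/urandom available. The pick_port name and the 20000-29999 range are my own placeholders for illustration, not necessarily what the script uses.

```shell
#!/bin/sh
# Hypothetical helper: print a random local port in the range 20000-29999.
pick_port() {
    # Read two random bytes as one unsigned integer (0-65535).
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    # Fold it into a 10000-port window well above the privileged range.
    echo $((20000 + $n % 10000))
}

LPORT=$(pick_port)
echo "will forward local port ${LPORT}"
```

This doesn't check whether the port is already taken; if it is, ssh will complain that it can't bind and you can just run it again.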

I call it hcon for 'hadoop console'. After configuring the script with the path to your EMR key file, you run it like:

hcon ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com

Here's the full script, but in case you're curious the magic lines (wrapped) are:

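# -f: background after authenticating; -N: no remote command, tunnel only
ssh -f -N -o "StrictHostKeyChecking no" \
-L "${LPORT}:localhost:9100" \
-i "${KEYFILE}" "hadoop@${HOST}"
# open the forwarded jobtracker UI in the local browser
$BROWSER "http://localhost:${LPORT}"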

(Yes, for this, I turn off StrictHostKeyChecking, so ssh accepts the master's new host key without stopping to ask.)
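If you'd rather see how the pieces fit together before grabbing the script, here is a rough sketch of a wrapper along the same lines. It is an approximation under assumptions, not the script itself: the default KEYFILE path, the firefox default for BROWSER, and the port range are all placeholders I made up for the example.

```shell
#!/bin/sh
# Sketch of an hcon-style wrapper. Defaults below are placeholders.
KEYFILE="${KEYFILE:-$HOME/.ssh/emr-key.pem}"   # assumed path; point it at your EMR key
BROWSER="${BROWSER:-firefox}"                  # assumed browser command

hcon() {
    if [ $# -ne 1 ]; then
        echo "usage: hcon <master-public-hostname>" >&2
        return 1
    fi
    HOST=$1
    # Random local port in 20000-29999 (the range is an arbitrary choice).
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    LPORT=$((20000 + $n % 10000))
    # -f: background after authenticating; -N: no remote command, tunnel only.
    ssh -f -N -o "StrictHostKeyChecking no" \
        -L "${LPORT}:localhost:9100" \
        -i "${KEYFILE}" "hadoop@${HOST}" || return 1
    # BROWSER is left unquoted so it can carry arguments, e.g. "open -a Safari".
    $BROWSER "http://localhost:${LPORT}"
}
```

To use it, source the file and call hcon with the master's hostname, or add a last line of hcon "$@" to make it a standalone script. Note that the backgrounded ssh keeps running after the browser opens, so you'll need to kill it yourself when you're done with the tunnel.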

Anyway, try it out and let me know if it's helpful at all.

1 comment:

Anonymous said...

You are the hero of my day! No more digging around in Lynx via SSH to obtain status info from my EMR job flows or log files during debugging :)