Friday, March 25, 2016

JOB TRACKER & TASK TRACKER RUNNING ON HADOOP

Track jobs and tasks running on Hadoop using the Hadoop admin view

If you need to run Hadoop on a single node cluster, please refer to this tutorial.

If you need to run Hadoop on a multi node cluster, refer to this tutorial.
If you have issues setting up Hadoop in either environment, please let me know. I can help you.

In this post I'm going to show how you can monitor Hadoop jobs and tasks, both on your local machine and on a remote server.

Once you start the Hadoop cluster, you can access the job tracker web interface at http://localhost:50030/. (If you are running Hadoop on a multi node cluster, replace localhost with the name of the job tracker. The job tracker name is either the IP address of the job tracker node or the name you have configured for that IP address in the /etc/hosts file.) You can change this port by changing the job tracker HTTP address property (mapred.job.tracker.http.address in Hadoop 1.x) in /conf/core-site.xml. Many Hadoop 1.x setups keep MapReduce properties like this one in conf/mapred-site.xml instead, so check there if the change does not take effect. In my setup, I changed the port from 50030 to 50031.
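The port change from 50030 to 50031 could be expressed roughly as follows. This is only a sketch under a few assumptions: it uses the Hadoop 1.x property name mapred.job.tracker.http.address, it binds to 0.0.0.0 (all interfaces), and the function name jobtracker_http_property is made up for illustration. The script only prints the property block; it does not touch any file.

```shell
# Print a <property> block that moves the job tracker web UI to another port.
# Assumptions: Hadoop 1.x property name; 0.0.0.0 means "listen on all interfaces".
jobtracker_http_property() {
  port="$1"   # new HTTP port for the job tracker web UI
  cat <<EOF
<property>
  <name>mapred.job.tracker.http.address</name>
  <value>0.0.0.0:${port}</value>
</property>
EOF
}

# Example: the 50030 -> 50031 change mentioned above.
jobtracker_http_property 50031
```

Paste the printed block between the <configuration> tags of your site file and restart the job tracker. After that the interface should answer on http://localhost:50031/ instead of port 50030.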






If Hadoop is running on a remote server, you need to create a tunnel from a port on your local machine to the job tracker port on the server before you can open this web interface in your browser. You can do this from the shell, for example with SSH local port forwarding.
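A tunnel like this is commonly opened with SSH local port forwarding. The sketch below assumes that approach and only builds and prints the command, so you can check it before running it. The login hadoop@my-server and the function name jobtracker_tunnel_cmd are hypothetical; use your own user, server name and ports.

```shell
# Build the SSH command that forwards a local port to the job tracker web UI.
# -N: do not run a remote command, only forward ports.
# -L: local_port:host:remote_port, where "localhost" is resolved on the server.
jobtracker_tunnel_cmd() {
  local_port="$1"   # port to open on your own machine
  remote_port="$2"  # job tracker HTTP port on the server (50030 by default)
  server="$3"       # SSH login such as user@server (hypothetical here)
  echo "ssh -N -L ${local_port}:localhost:${remote_port} ${server}"
}

# Prints: ssh -N -L 50030:localhost:50030 hadoop@my-server
jobtracker_tunnel_cmd 50030 50030 hadoop@my-server
```

Run the printed command in a terminal and leave it open. While it is running, http://localhost:50030/ in your local browser shows the job tracker page from the server. Press Ctrl+C to close the tunnel.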








The image below shows the job tracker web view. Here you can see running and historical jobs along with job details such as the number of tasks per job, the types of tasks in the job, the time taken to execute the job, and more.












If you need to access the task tracker, you can use the address http://localhost:50060/, since 50060 is the default task tracker HTTP port in Hadoop 1.x. (If you are running Hadoop on a multi node cluster, replace localhost with the name of the task tracker you want to monitor. The task tracker name can be either the IP address of the task tracker node or the name you have assigned to that IP address in the /etc/hosts file.)

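If you need to change the port of the task tracker, you can do so by changing the task tracker HTTP address property (mapred.task.tracker.http.address in Hadoop 1.x) in your core-site.xml file, or in mapred-site.xml if that is where your MapReduce properties are kept.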
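As a rough illustration, the task tracker property could look like the block printed below. The sketch assumes the Hadoop 1.x property name mapred.task.tracker.http.address and the bind address 0.0.0.0. The port 50061 and the function name tasktracker_http_property are hypothetical and chosen only for the example.

```shell
# Print a <property> block that moves the task tracker web UI to another port.
# Assumption: Hadoop 1.x property name, default value 0.0.0.0:50060.
tasktracker_http_property() {
  port="$1"   # new HTTP port for the task tracker web UI
  cat <<EOF
<property>
  <name>mapred.task.tracker.http.address</name>
  <value>0.0.0.0:${port}</value>
</property>
EOF
}

# Example with a hypothetical port, 50061.
tasktracker_http_property 50061
```

The change has to be made on each task tracker node whose port you want to move, and the task tracker has to be restarted afterwards.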






As with the job tracker, if Hadoop is running on a remote server, you need to create a tunnel from a local port to the task tracker port on the server in order to access this page.
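The task tracker tunnel can be sketched the same way as an SSH local port forward. This example assumes the Hadoop 1.x default task tracker HTTP port, 50060. The login hadoop@my-server, the node name worker1 and the function name tasktracker_tunnel_cmd are hypothetical. The optional third argument covers the multi node case, where the task tracker runs on a different node than the one you log in to.

```shell
# Build the SSH command that forwards a local port to a task tracker web UI.
# Assumption: the task tracker listens on its default HTTP port, 50060.
tasktracker_tunnel_cmd() {
  local_port="$1"                 # port to open on your own machine
  server="$2"                     # SSH login such as user@server (hypothetical)
  tracker_host="${3:-localhost}"  # task tracker host as seen from the server
  echo "ssh -N -L ${local_port}:${tracker_host}:50060 ${server}"
}

# Task tracker on the server itself.
# Prints: ssh -N -L 50060:localhost:50060 hadoop@my-server
tasktracker_tunnel_cmd 50060 hadoop@my-server

# Task tracker on another node, reached through the server.
# Prints: ssh -N -L 50061:worker1:50060 hadoop@my-server
tasktracker_tunnel_cmd 50061 hadoop@my-server worker1
```

Using a different local port per task tracker, as in the second call, lets you keep several tunnels open at once and view each node in its own browser tab.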





Using the task tracker view, you can monitor the tasks run on that task tracker and their status. A screenshot of the task tracker view is shown below.











If you have any issues accessing these monitoring tools, please let me know.
