3 replies Latest Post - ‏2013-12-16T15:53:32Z by dvillegas
lilly0129
2 Posts

Hadoop MapReduce job randomly freezes in "BigInsights Basic 1.4 64b"

2013-12-13T05:18:16Z

I ran a Hadoop MapReduce job in BigInsights Basic 1.4 64b. My source code is almost exactly the same as the WordCount v2.0 example from

http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Example%3A+WordCount+v2.0

However, my program randomly freezes during execution. Most of the time it freezes when the reduce stage is at about 21%. It simply stops there without any error message. The only thing I can do is restart the job; sometimes it then runs to completion, and the final result is correct.

Here is my screen output:

[idcuser@vhost0150 IHC]$ bin/hadoop jar wordcount.jar org.myorg.WordCount /user/idcuser/input/dataset1 /user/idcuser/output/wordcount/dataset1
13/12/13 00:01:25 INFO mapred.FileInputFormat: Total input paths to process : 401
13/12/13 00:01:26 INFO mapred.JobClient: Running job: job_201312122346_0003
13/12/13 00:01:27 INFO mapred.JobClient:  map 0% reduce 0%
13/12/13 00:01:36 INFO mapred.JobClient:  map 1% reduce 0%
13/12/13 00:01:39 INFO mapred.JobClient:  map 2% reduce 0%
13/12/13 00:01:41 INFO mapred.JobClient:  map 3% reduce 0%
13/12/13 00:01:43 INFO mapred.JobClient:  map 4% reduce 0%
13/12/13 00:01:45 INFO mapred.JobClient:  map 5% reduce 0%
13/12/13 00:01:47 INFO mapred.JobClient:  map 6% reduce 0%
13/12/13 00:01:49 INFO mapred.JobClient:  map 7% reduce 0%
13/12/13 00:01:51 INFO mapred.JobClient:  map 8% reduce 0%
13/12/13 00:01:53 INFO mapred.JobClient:  map 9% reduce 0%
13/12/13 00:01:55 INFO mapred.JobClient:  map 10% reduce 0%
13/12/13 00:01:58 INFO mapred.JobClient:  map 11% reduce 0%
13/12/13 00:02:01 INFO mapred.JobClient:  map 12% reduce 0%
13/12/13 00:02:02 INFO mapred.JobClient:  map 13% reduce 0%
13/12/13 00:02:05 INFO mapred.JobClient:  map 14% reduce 0%
13/12/13 00:02:08 INFO mapred.JobClient:  map 15% reduce 0%
13/12/13 00:02:09 INFO mapred.JobClient:  map 16% reduce 0%
13/12/13 00:02:12 INFO mapred.JobClient:  map 17% reduce 0%
13/12/13 00:02:14 INFO mapred.JobClient:  map 18% reduce 0%
13/12/13 00:02:15 INFO mapred.JobClient:  map 19% reduce 0%
13/12/13 00:02:19 INFO mapred.JobClient:  map 20% reduce 0%
13/12/13 00:02:21 INFO mapred.JobClient:  map 21% reduce 0%
13/12/13 00:02:22 INFO mapred.JobClient:  map 22% reduce 0%
13/12/13 00:02:25 INFO mapred.JobClient:  map 23% reduce 0%
13/12/13 00:02:28 INFO mapred.JobClient:  map 24% reduce 0%
13/12/13 00:02:29 INFO mapred.JobClient:  map 25% reduce 0%
13/12/13 00:02:32 INFO mapred.JobClient:  map 26% reduce 0%
13/12/13 00:02:34 INFO mapred.JobClient:  map 27% reduce 0%
13/12/13 00:02:36 INFO mapred.JobClient:  map 28% reduce 0%
13/12/13 00:02:38 INFO mapred.JobClient:  map 29% reduce 0%
13/12/13 00:02:40 INFO mapred.JobClient:  map 30% reduce 0%
13/12/13 00:02:43 INFO mapred.JobClient:  map 31% reduce 0%
13/12/13 00:02:45 INFO mapred.JobClient:  map 32% reduce 0%
13/12/13 00:02:48 INFO mapred.JobClient:  map 33% reduce 0%
13/12/13 00:02:50 INFO mapred.JobClient:  map 34% reduce 0%
13/12/13 00:02:52 INFO mapred.JobClient:  map 35% reduce 0%
13/12/13 00:02:54 INFO mapred.JobClient:  map 36% reduce 0%
13/12/13 00:02:56 INFO mapred.JobClient:  map 37% reduce 0%
13/12/13 00:02:58 INFO mapred.JobClient:  map 38% reduce 0%
13/12/13 00:03:00 INFO mapred.JobClient:  map 39% reduce 0%
13/12/13 00:03:03 INFO mapred.JobClient:  map 40% reduce 0%
13/12/13 00:03:05 INFO mapred.JobClient:  map 41% reduce 0%
13/12/13 00:03:07 INFO mapred.JobClient:  map 42% reduce 0%
13/12/13 00:03:09 INFO mapred.JobClient:  map 43% reduce 0%
13/12/13 00:03:11 INFO mapred.JobClient:  map 44% reduce 0%
13/12/13 00:03:13 INFO mapred.JobClient:  map 45% reduce 0%
13/12/13 00:03:16 INFO mapred.JobClient:  map 46% reduce 0%
13/12/13 00:03:17 INFO mapred.JobClient:  map 47% reduce 0%
13/12/13 00:03:20 INFO mapred.JobClient:  map 48% reduce 0%
13/12/13 00:03:22 INFO mapred.JobClient:  map 49% reduce 0%
13/12/13 00:03:24 INFO mapred.JobClient:  map 50% reduce 0%
13/12/13 00:03:26 INFO mapred.JobClient:  map 51% reduce 0%
13/12/13 00:03:28 INFO mapred.JobClient:  map 52% reduce 0%
13/12/13 00:03:31 INFO mapred.JobClient:  map 53% reduce 0%
13/12/13 00:03:33 INFO mapred.JobClient:  map 54% reduce 0%
13/12/13 00:03:35 INFO mapred.JobClient:  map 55% reduce 0%
13/12/13 00:03:39 INFO mapred.JobClient:  map 56% reduce 0%
13/12/13 00:03:41 INFO mapred.JobClient:  map 57% reduce 0%
13/12/13 00:03:44 INFO mapred.JobClient:  map 57% reduce 11%
13/12/13 00:03:45 INFO mapred.JobClient:  map 58% reduce 11%
13/12/13 00:03:47 INFO mapred.JobClient:  map 58% reduce 12%
13/12/13 00:03:48 INFO mapred.JobClient:  map 59% reduce 12%
13/12/13 00:03:50 INFO mapred.JobClient:  map 60% reduce 12%
13/12/13 00:03:54 INFO mapred.JobClient:  map 61% reduce 12%
13/12/13 00:03:57 INFO mapred.JobClient:  map 61% reduce 13%
13/12/13 00:03:58 INFO mapred.JobClient:  map 62% reduce 13%
13/12/13 00:04:00 INFO mapred.JobClient:  map 63% reduce 13%
13/12/13 00:04:03 INFO mapred.JobClient:  map 64% reduce 13%
13/12/13 00:04:05 INFO mapred.JobClient:  map 65% reduce 13%
13/12/13 00:04:08 INFO mapred.JobClient:  map 66% reduce 13%
13/12/13 00:04:12 INFO mapred.JobClient:  map 67% reduce 14%
13/12/13 00:04:15 INFO mapred.JobClient:  map 68% reduce 14%
13/12/13 00:04:18 INFO mapred.JobClient:  map 69% reduce 14%
13/12/13 00:04:20 INFO mapred.JobClient:  map 70% reduce 14%
13/12/13 00:04:24 INFO mapred.JobClient:  map 70% reduce 15%
13/12/13 00:04:25 INFO mapred.JobClient:  map 71% reduce 15%
13/12/13 00:04:26 INFO mapred.JobClient:  map 72% reduce 15%
13/12/13 00:04:30 INFO mapred.JobClient:  map 73% reduce 15%
13/12/13 00:04:33 INFO mapred.JobClient:  map 74% reduce 15%
13/12/13 00:04:36 INFO mapred.JobClient:  map 75% reduce 15%
13/12/13 00:04:38 INFO mapred.JobClient:  map 76% reduce 15%
13/12/13 00:04:40 INFO mapred.JobClient:  map 76% reduce 16%
13/12/13 00:04:41 INFO mapred.JobClient:  map 77% reduce 16%
13/12/13 00:04:43 INFO mapred.JobClient:  map 78% reduce 16%
13/12/13 00:04:46 INFO mapred.JobClient:  map 79% reduce 16%
13/12/13 00:04:50 INFO mapred.JobClient:  map 80% reduce 16%
13/12/13 00:04:52 INFO mapred.JobClient:  map 80% reduce 17%
13/12/13 00:04:53 INFO mapred.JobClient:  map 81% reduce 17%
13/12/13 00:04:55 INFO mapred.JobClient:  map 82% reduce 17%
13/12/13 00:04:59 INFO mapred.JobClient:  map 83% reduce 17%
13/12/13 00:05:00 INFO mapred.JobClient:  map 84% reduce 17%
13/12/13 00:05:05 INFO mapred.JobClient:  map 85% reduce 17%
13/12/13 00:05:06 INFO mapred.JobClient:  map 86% reduce 17%
13/12/13 00:05:10 INFO mapred.JobClient:  map 87% reduce 18%
13/12/13 00:05:12 INFO mapred.JobClient:  map 88% reduce 18%
13/12/13 00:05:15 INFO mapred.JobClient:  map 89% reduce 18%
13/12/13 00:05:17 INFO mapred.JobClient:  map 90% reduce 18%
13/12/13 00:05:19 INFO mapred.JobClient:  map 90% reduce 19%
13/12/13 00:05:21 INFO mapred.JobClient:  map 91% reduce 19%
13/12/13 00:05:24 INFO mapred.JobClient:  map 92% reduce 19%
13/12/13 00:05:25 INFO mapred.JobClient:  map 93% reduce 19%
13/12/13 00:05:28 INFO mapred.JobClient:  map 94% reduce 19%
13/12/13 00:05:31 INFO mapred.JobClient:  map 95% reduce 19%
13/12/13 00:05:34 INFO mapred.JobClient:  map 95% reduce 20%
13/12/13 00:05:35 INFO mapred.JobClient:  map 96% reduce 20%
13/12/13 00:05:37 INFO mapred.JobClient:  map 97% reduce 20%
13/12/13 00:05:40 INFO mapred.JobClient:  map 98% reduce 20%
13/12/13 00:05:43 INFO mapred.JobClient:  map 99% reduce 20%
13/12/13 00:05:45 INFO mapred.JobClient:  map 100% reduce 20%
13/12/13 00:05:46 INFO mapred.JobClient:  map 100% reduce 21%
 

Can someone tell me what might be the problem? Thanks.

Updated on 2013-12-13T05:23:40Z by lilly0129
  • dvillegas
    17 Posts

    Re: Hadoop MapReduce job randomly freezes in "BigInsights Basic 1.4 64b"

    2013-12-13T19:39:57Z in response to lilly0129

    When your job freezes like this, can you check the JobTracker UI at http://<jobtracker_url>:50030?

    There you should be able to find additional information by opening the log of the stuck task attempt.

    Other sources of information to help debug the problem are the JobTracker logs in $BIGINSIGHTS_VAR/hadoop/logs/*jobtracker*.log and the TaskTracker logs in $BIGINSIGHTS_VAR/hadoop/log/*tasktracker*.log
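For what it's worth, going through those log files by hand can be tedious, so here is a minimal sketch of how one might filter them. It assumes the logs are plain-text log4j files whose lines contain a level keyword such as WARN or ERROR; the function name `find_job_log_lines` and the choice of levels are made up for illustration, and the log directory should be adjusted to wherever the logs actually live on your cluster.

```python
import glob
import os
import re


def find_job_log_lines(log_dir, job_id, levels=("WARN", "ERROR", "FATAL")):
    """Return (path, line) pairs from *.log files in log_dir that mention
    the job at one of the given log levels. Hypothetical helper."""
    # Tasks and attempts are named task_<id>_... and attempt_<id>_...,
    # so match on the part of the job id after the "job_" prefix.
    job_suffix = job_id.replace("job_", "", 1)
    level_re = re.compile(r"\b(" + "|".join(levels) + r")\b")
    hits = []
    for path in sorted(glob.glob(os.path.join(log_dir, "*.log"))):
        with open(path, errors="replace") as fh:
            for line in fh:
                if job_suffix in line and level_re.search(line):
                    hits.append((path, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Assumption: BIGINSIGHTS_VAR is set in the environment and the logs
    # sit under hadoop/logs; change the directory if yours differ.
    log_dir = os.path.join(os.environ.get("BIGINSIGHTS_VAR", "."), "hadoop", "logs")
    for path, line in find_job_log_lines(log_dir, "job_201312130100_0003"):
        print(f"{os.path.basename(path)}: {line}")
```

Run it on the JobTracker node and again on the node that hosts the stuck reducer, since each TaskTracker writes its own log.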

     

    Updated on 2013-12-13T19:40:25Z by dvillegas
    • lilly0129
      2 Posts

      Re: Hadoop MapReduce job randomly freezes in "BigInsights Basic 1.4 64b"

      2013-12-14T18:54:42Z in response to dvillegas

      I checked the JobTracker UI. It didn't give me any useful information. On the summary page everything seems OK, but the job just froze with the reduce stage at 21%:

      Hadoop job_201312130100_0003 on vhost0150

      User: idcuser
      Job Name: wordcount
      Job File: hdfs://170.225.96.150:9000/user/idcuser/.staging/job_201312130100_0003/job.xml
      Submit Host: vhost0150.dc1.co.us.compute.ihost.com
      Submit Host Address: 170.225.96.150
      Job-ACLs: All users are allowed
      Job Setup: Successful
      Status: Running
      Started at: Sat Dec 14 13:38:51 EST 2013
      Running for: 12mins, 44sec
      Job Cleanup: Pending

       


       

       

      Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
      map     100.00%     401        0        0        401       0       0 / 0
      reduce  21.36%      1          0        1        0         0       0 / 0

       

       

       

      When I click on the reduce row, it shows the following:

      All Tasks

      Task                             Complete  Status                                     Start Time            Finish Time  Errors  Counters
      task_201312130100_0003_r_000000  21.36%    reduce > copy (257 of 401 at 0.01 MB/s) >  14-Dec-2013 13:40:34                       12
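A note on reading that status line: in Hadoop 1.x a reduce task's progress is split into three equally weighted phases (copy, sort, reduce), and the copy phase advances with the number of map outputs fetched. Under that assumption, the 21.36% figure is just 257 of 401 map outputs fetched, divided by three, which suggests the reducer is stuck in the copy (shuffle) phase rather than in the reduce code itself. A small sketch of the arithmetic, with a made-up helper name:

```python
import re


def reduce_progress_from_copy_status(status):
    """Estimate overall reduce progress (0..1) from a copy-phase status
    string, assuming copy, sort and reduce each count for one third."""
    match = re.search(r"copy \((\d+) of (\d+)", status)
    if match is None:
        raise ValueError("not a copy-phase status: %r" % status)
    copied, total = int(match.group(1)), int(match.group(2))
    # Only the copy phase has made progress, so scale its fraction by 1/3.
    return (copied / total) / 3.0


status = "reduce > copy (257 of 401 at 0.01 MB/s) >"
print(f"{100 * reduce_progress_from_copy_status(status):.2f}%")  # prints 21.36%
```

If that reading is right, the place to look is the reducer's attempt log and the TaskTracker logs for failed or slow fetches of map output, especially given the 0.01 MB/s transfer rate in the status.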

       


      I then have to cancel the job manually to stop it.

      Updated on 2013-12-15T00:50:04Z by lilly0129
      • dvillegas
        17 Posts

        Re: Hadoop MapReduce job randomly freezes in "BigInsights Basic 1.4 64b"

        2013-12-16T15:53:32Z in response to lilly0129

        You can click on the reducer's task id, then go to "All" under "Task logs" to check whether there is any information about why the task is not progressing.

        Also, as mentioned before, check the JobTracker's logs and then the TaskTracker's logs on the machine where the task ran (you can find that machine in the "Machine" column after you click on the task's id).