  • Latest Post - ‏2013-02-15T19:08:54Z by jlerm
jlerm
15 Posts

Unable to remove data nodes

2013-02-13T23:05:14Z
I'm trying to remove two data nodes, out of a three-node BI 2.0 cluster.
I've updated hadoop-conf/hdfs-site.xml, setting dfs.replication.min=1 and dfs.replication=1.

However, when I try to remove a node with "removenode.sh hadoop host51", for instance, it complains that the number of remaining slaves is less than the number of expected HDFS replicas (3).
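
To make the arithmetic behind that message concrete, here is a minimal Python sketch. It is not the actual logic of removenode.sh, which is not shown in this thread; the rule in `can_remove_node` is an assumption inferred from the error text, and the helper names and the sample XML are hypothetical. The property names are the standard Hadoop 1.x ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content carrying the two settings from the post.
HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.replication.min</name>
    <value>1</value>
  </property>
</configuration>
"""


def read_property(xml_text, name, default=None):
    """Return the value of a Hadoop-style <property> entry, or default if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return default


def can_remove_node(total_datanodes, replication):
    """Assumed rule: the nodes left after removal must be able to hold every replica."""
    return total_datanodes - 1 >= replication


# Hadoop's default for dfs.replication is 3 when the property is not set.
replication = int(read_property(HDFS_SITE, "dfs.replication", "3"))
print(can_remove_node(3, replication))
```

Under this assumed rule, a three-node cluster with a replication factor of 3 cannot lose a node (2 remaining < 3), while the same cluster with a replication factor of 1 can. An error that still reports 3 would then suggest that the tool is reading a configuration in which the change has not taken effect.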

I restarted the cluster after updating hdfs-site.xml several times, but the same error persists.

I also ran "hadoop dfs -setrep -w 1 /", to make sure all files are set to 1 replica, with no luck.

Is there any other file to be updated, or any other step to be followed?

Thanks,

Julius


Updated on 2013-02-15T19:08:54Z by jlerm
  • SystemAdmin
    603 Posts

    Re: Unable to remove data nodes

    ‏2013-02-14T06:21:57Z  
    Hi Julius,

    When you modify Hadoop configuration files within a BigInsights install, please consult this page for more details and the proper instructions.

    http://pic.dhe.ibm.com/infocenter/bigins/v2r0/topic/com.ibm.swg.im.infosphere.biginsights.admin.doc/doc/SynchConfigComp.html?resultof=%22%73%79%6e%63%63%6f%6e%66%22%20

    Going back to your original question. Here is your checklist:

    What roles or components are running on "host51"? Is it only Hadoop (datanode and tasktracker), or are there more?
    After following the exact procedure on the page above to modify and propagate your Hadoop configuration changes, do you still have an issue removing your datanode?

    Apart from these questions, it is not normal or typical to remove datanodes below the minimal HDFS replica count. Doing so increases the risk of data loss.

    Hope this helps.
    Jim
  • jlerm
    15 Posts

    Re: Unable to remove data nodes

    ‏2013-02-14T15:59:42Z  
    Thanks a lot Jim, that helped!
  • jlerm
    15 Posts

    Re: Unable to remove data nodes

    ‏2013-02-15T19:08:54Z  
    I forgot to address your point regarding the minimal replica count.
    I was trying this out on my own laptop, with a set of VMs.
    I had installed BigInsights on 3 VMs, and I wanted to remove the two data nodes so that I don't have to bring them all up for the next things I want to work on. Basically, I no longer need a distributed configuration; a single node is enough.
    I understand that this should never be done in a real-world environment.

    Thanks a lot,

    Julius