aelfner
47 Posts

Pinned topic DHCP handshake failing

2013-02-07T05:30:12Z
The Kernel Service VM build fails during the DHCP handshake and times out. The new VM on Storage-1 issues a DHCPDISCOVER; PXE picks it up and sends a DHCPOFFER back, but the new VM doesn't seem to see it and never continues with the DHCPREQUEST.
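
As a rough aid for narrowing down where the handshake stalls, the sketch below reports the furthest DHCP message type that appears in a log read from stdin. It assumes the log names message types as DHCPDISCOVER, DHCPOFFER, DHCPREQUEST and DHCPACK, the way ISC dhcpd syslog lines do; last_dhcp_stage is a hypothetical helper name, not part of any tool mentioned in this thread.

```shell
# Report the furthest DHCP handshake stage named in a log read from stdin.
# Assumption: message types appear literally as DHCPDISCOVER, DHCPOFFER,
# DHCPREQUEST, DHCPACK (tcpdump output may spell them differently).
last_dhcp_stage() {
    log=$(cat)
    stage=none
    # Walk the stages in handshake order; the last one found wins.
    for msg in DHCPDISCOVER DHCPOFFER DHCPREQUEST DHCPACK; do
        case "$log" in
            *"$msg"*) stage=$msg ;;
        esac
    done
    printf '%s\n' "$stage"
}
```

Run against a server-side log and a client-side log separately, a result of DHCPOFFER on the server but only DHCPDISCOVER on the client would match the symptom described here: the offer is sent but never arrives.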

PXE is a Workstation 9 VM talking on eth0; Storage-1 is a Workstation 9 VM talking on br0. iptables is irrelevant (same result with it on or off).

The tcpdump logs are attached (as one file, due to limitations of this wiki).

Appreciate any insight.
Updated on 2013-02-19T21:42:15Z by aelfner
  • SystemAdmin
    92 Posts

    Re: DHCP handshake failing

    2013-02-07T13:36:47Z
    How many NICs does the PXE VM have?
  • aelfner
    47 Posts

    Re: DHCP handshake failing

    2013-02-07T16:39:26Z
    Just one.
  • SystemAdmin
    92 Posts

    Re: DHCP handshake failing

    2013-02-08T10:51:11Z
    Hi,
    This looks more like a network issue, probably related to the VMware Workstation 9 configuration.
    Are the two VMs configured on the same network?
  • aelfner
    47 Posts

    Re: DHCP handshake failing

    2013-02-08T17:12:12Z
    They're set up the same way they were with SCP 1.2. The attachment shows the interfaces and routes.
  • SystemAdmin
    92 Posts

    Re: DHCP handshake failing

    2013-02-11T15:12:53Z
    In the screenshot I see that the PXE VM and Storage-1 seem to have two different default gateways and/or DNS servers.
    Could you check whether that is the case?
  • aelfner
    47 Posts

    Re: DHCP handshake failing

    2013-02-14T03:54:09Z
    Hi Pino,

    I've set the routes and the /etc/resolv.conf files to match, and removed an extraneous interface on Storage-1. Still, the DHCPOFFER from PXE is not being seen by the test program (dhtest) or by a started KVM VM. iptables is off on both sides, and traceroute shows direct connections (1 hop). I am stumped. I ran dhcpdump, but it only shows the requests and offers in greater detail, not WHY the server's offer is not seen by the client.

    The appliance works, but of course that is a bunch of KVM VMs on top of a single WS9 VM, whereas here I have either two WS9 VMs, or one WS9 VM talking to a KVM VM on top of a WS9 VM. Either way, this should be simple IP chatter if routing is correct; I just can't figure it out.

    Appreciate any help.
  • SystemAdmin
    92 Posts

    Re: DHCP handshake failing

    2013-02-17T19:30:45Z
    There are a few things you need to be aware of when running SCP (or any hypervisor-based system) within VMware Workstation. From the screenshots you appear to be running Linux, so you need to make sure that promiscuous mode is enabled on the /dev/vmnet adapter that you are using. There are many ways to do this, but for a quick test you can run the following command to change the settings. Note: You will need to power off all VMs and shut down VMWS before doing this.

    chmod 777 /dev/vmnet*
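
    To see whether that permission change is still in place (for example after an update or reboot), something like the sketch below could be used. It lists any of the given device nodes that are not readable and writable by "other". It assumes GNU stat is available; report_restricted is a hypothetical helper name.

```shell
# Print each existing path that is NOT read/write for "other", with its mode.
# Assumption: GNU coreutils stat (-c %a prints the octal mode, e.g. 600).
report_restricted() {
    for dev in "$@"; do
        [ -e "$dev" ] || continue
        mode=$(stat -c %a "$dev")
        other=${mode#"${mode%?}"}    # last octal digit = permissions for "other"
        case "$other" in
            6|7) ;;                  # rw- or rwx for other: nothing to report
            *) printf '%s %s\n' "$dev" "$mode" ;;
        esac
    done
}
```

    Calling it as report_restricted /dev/vmnet* prints nothing when every vmnet device is open to all users, and one line per device otherwise.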

    Secondly, the vmxnet3 adapter has a habit of dropping UDP packets on occasion (documented on the VMware KB). As a result, services such as DHCP/DNS/TFTP can experience issues (much like what you are seeing). Ensure that for all of your VMs (especially the compute/storage nodes) the e1000 adapter type is specified in the .vmx file. Without this you will find that PXE booting is unreliable and the installation process will typically fail.
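
    A quick way to review the adapter type across several VMs is to pull the setting out of each .vmx file. The sketch below assumes the adapter type is stored in lines of the form ethernet0.virtualDev = "e1000"; vmx_nic_types is a hypothetical helper name, and adapters with no virtualDev line are simply not listed.

```shell
# Print "<adapter> <type>" for every ethernetN.virtualDev entry in a .vmx file.
# Assumption: entries look like  ethernet0.virtualDev = "e1000"
vmx_nic_types() {
    sed -n 's/^[[:space:]]*\(ethernet[0-9][0-9]*\)\.virtualDev[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1 \2/p' "$1"
}
```

    For a VM that still uses vmxnet3, the output would contain a line such as "ethernet0 vmxnet3", which is the entry to change to e1000 while the VM is powered off.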

    Finally, ensure you haven't left a firewall active on the Firstbox system. As the first 1024 ports are typically blocked on Red Hat for incoming traffic, you could be hitting issues there as well. Run 'iptables -L' to see what rules you have active, and use 'service ip6tables stop ; service iptables stop' to shut the firewall down.
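
    To confirm at a glance that the firewall really is off, the listing can be summarised rather than read by eye. The sketch below counts rule lines and non-ACCEPT chain policies in text piped to it. It assumes the plain (non-verbose) layout printed by 'iptables -L -n'; firewall_summary is a hypothetical helper name.

```shell
# Summarise an `iptables -L -n` listing read from stdin.
# Assumption: non-verbose layout, i.e. "Chain NAME (policy X)" lines,
# a "target prot opt ..." column header, then one line per rule.
firewall_summary() {
    awk '
        /^Chain / {
            if ($3 == "(policy") {
                p = $4
                sub(/\)$/, "", p)          # strip the closing parenthesis
                if (p != "ACCEPT") strict++
            }
            next
        }
        /^target[[:space:]]/ || /^[[:space:]]*$/ { next }   # header and blank lines
        { rules++ }
        END { printf "rules=%d non_accept_policies=%d\n", rules, strict }
    '
}
```

    Run as root with iptables -L -n | firewall_summary; a stopped firewall would be expected to report rules=0 non_accept_policies=0.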

    It is possible to do what you are describing (I do it all the time). Let us know how you get on.
  • SystemAdmin
    92 Posts

    Re: DHCP handshake failing

    2013-02-17T19:35:56Z
    I forgot to mention (not strictly related): ensure in VMWS9 that you have nested virtualization support enabled; otherwise the build of the Kernel Service virtual machines will fail. You can enable this either through the UI or by editing the VMX file and adding the following line:

    vhv.enable = "TRUE"

    You should also ensure that, where possible (i.e., if your hardware is new enough), you change the virtualization engine to use VT and EPT, not just VT and not automatic. You will need this for the nested support.
  • aelfner
    47 Posts

    Re: DHCP handshake failing

    2013-02-19T21:42:15Z
    Stuart,

    Just the other day I thought of the promiscuous mode requirement and noticed it was no longer set. I had updated my /etc/init.d/vmware-workstation-server to include the chmod over a year ago, but that script must have been overwritten at some point. Got a chance to try things out today and the install works fine. As always, very much appreciate your help and insight,

    Axel