Adding a Windows worker node to the IBM Cloud Private cluster

As a technology preview, you can add a Windows™ worker node to an existing IBM Cloud Private cluster. Afterward, you can deploy a Windows application to the Windows node.

Important: This content is a technology preview, and should not be relied on in a production environment.

System requirements

Review the following system requirements:

Hardware requirements:

Table 1. Minimum hardware requirements
Node type     Number   CPU   Memory   Disk
Worker node   >=1      2     >=4 GB   >50 GB

Supported operating systems and platforms

Table 2. Supported operating systems
Platform   Operating system
Windows    Windows Server version 1803, 1809

Supported Docker version and container type

Table 3. Supported Docker versions
Platform   Docker EE               Container type
Windows    17.06.1-ee-2 or later   Windows Server containers
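
The "17.06.1-ee-2 or later" rule in Table 3 can be checked against the Version string that docker version reports. The following Python sketch is an illustration only: the function names parse_docker_ee and meets_minimum are hypothetical, and the sketch assumes that the version string follows the N.N.N-ee-N pattern that is shown in this topic.

```python
import re

# Assumed pattern for Docker EE version strings such as "17.06.1-ee-2".
_EE_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)-ee-(\d+)$")


def parse_docker_ee(version):
    """Return the version as a tuple of four integers, e.g. (17, 6, 1, 2)."""
    match = _EE_VERSION.match(version.strip())
    if match is None:
        raise ValueError("not a Docker EE version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def meets_minimum(version, minimum="17.06.1-ee-2"):
    """Return True if version is the Table 3 minimum or later."""
    # Tuples compare element by element, which matches version ordering here.
    return parse_docker_ee(version) >= parse_docker_ee(minimum)


# Version string taken from the sample docker version output in this topic.
print(meets_minimum("18.03.1-ee-4"))
```

Version strings that do not match the assumed pattern raise ValueError rather than being compared incorrectly.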

Supported features

Table 4. Supported features
Feature                                                      Windows worker node
Deploying Windows application in IBM Cloud Private cluster   Y
Containerd                                                   N
IBM Cloud Private Cloud Foundry                              N
Cloud Automation Manager                                     N
IBM Cloud Private-CE                                         Y
Installation                                                 N
IPSec                                                        N
Logging                                                      N
Metering                                                     N
Monitoring: Prometheus                                       N
Networking: Calico                                           N
Networking: NSX-T                                            N
Nvidia GPU support                                           N
Storage: GlusterFS                                           N
Storage: VMware                                              N
Storage: Minio                                               N
Volume encryption                                            N
Vulnerability advisor                                        N

Prerequisites

Planning your network topology

There are several supported network configurations with Kubernetes on Windows. For more information, see Guide for adding Windows Nodes in Kubernetes.

Note: To minimize the impact on existing IBM Cloud Private cluster networks, you must verify that Host-Gateway is used as the network solution to integrate Windows with IBM Cloud Private.

Disabling Calico IP-in-IP

  1. Install the Calico CLI. For details, see Installing the Calico CLI (calicoctl).
  2. To obtain the current IP pool (ippool) specification of the environment, run the following commands on the master node:

     export ETCD_ENDPOINTS=https://<MASTERIP>:4001
     export ETCD_CERT_FILE=/etc/cfc/conf/etcd/client.pem
     export ETCD_KEY_FILE=/etc/cfc/conf/etcd/client-key.pem
     export ETCD_CA_CERT_FILE=/etc/cfc/conf/etcd/ca.pem
    
     calicoctl get ippool default-ipv4-ippool -o yaml > ippool.yaml
     cat ippool.yaml
    

    The contents of the ippool.yaml file are shown in the following example:

     apiVersion: projectcalico.org/v3
     kind: IPPool
     metadata:
       creationTimestamp: 2019-01-28T16:46:29Z
       name: default-ipv4-ippool
       resourceVersion: "94911"
       uid: 42d2e92c-231c-11e9-837b-000c295cba9c
     spec:
       cidr: 10.1.0.0/16
       ipipMode: Always
       natOutgoing: true
    
  3. To disable IP-in-IP mode, change the value of ipipMode in the ippool.yaml file from Always to Never, and then run the following command to apply the change:

     calicoctl apply -f ./ippool.yaml
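
The edit to ippool.yaml can also be scripted instead of made by hand. The following Python sketch is one possible approach, not part of the product: the function name disable_ipip is hypothetical, and the sketch assumes that the file contains exactly one ipipMode line, as in the example in this topic. It works on the file text with a regular expression so that no YAML library is needed.

```python
import re


def disable_ipip(yaml_text):
    """Return yaml_text with the value of ipipMode set to Never."""
    new_text, count = re.subn(
        r"^([ \t]*ipipMode:[ \t]*)\S+[ \t]*$",  # keep the key and its indentation
        r"\1Never",
        yaml_text,
        flags=re.MULTILINE,
    )
    if count != 1:
        # Refuse to guess if the file does not look like the expected IPPool.
        raise ValueError("expected exactly one ipipMode line, found %d" % count)
    return new_text


# Fragment of the IPPool specification from the example in this topic.
sample = """spec:
  cidr: 10.1.0.0/16
  ipipMode: Always
  natOutgoing: true
"""
print(disable_ipip(sample))
```

To use the sketch, read the text of ippool.yaml, pass it through disable_ipip, and write the result back to the file before you run calicoctl apply.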
    

Preparing the Windows worker node

  1. Install Docker on the Windows server:

     Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
     Install-Package -Name docker -ProviderName DockerMsftProvider -Force
    

    For more information, see the Microsoft documentation.

  2. Restart the Windows host:

     Restart-Computer -Force
    
  3. After the host is running, start the Docker service:

     Start-Service docker
    
  4. Verify the Docker installation. For example:

     docker version
     Client:
      Version:           18.03.1-ee-4
      API version:       1.37
      Go version:        go1.10.2
      Git commit:        0ded23c
      Built:             Thu Oct 25 00:41:52 2018
      OS/Arch:           windows/amd64
      Experimental:      false
     Server:
      Engine:
       Version:          18.03.1-ee-4
       API version:      1.37 (minimum version 1.24)
       Go version:       go1.10.2
       Git commit:       0ded23c
       Built:            Thu Oct 25 00:56:17 2018
       OS/Arch:          windows/amd64
       Experimental:     false
    
  5. Enable IP forwarding on the Windows worker:

     PS C:\Users\Administrator> reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v IPEnableRouter /t REG_DWORD /d 1 /f
    
  6. Restart the Windows host:

     Restart-Computer -Force
    
  7. Configure SSH without a password from the Windows worker node to the master node:

     ssh-keygen -b 4096 -f $Env:UserProfile\.ssh\id_rsa -N '""'
    
     function ssh-copy-id([string]$userAtMachine){
         # Append the local public key to authorized_keys on the remote host.
         $publicKey = "$ENV:USERPROFILE" + "/.ssh/id_rsa.pub"
         if (!(Test-Path "$publicKey")){
             Write-Error "ERROR: failed to open ID file '$publicKey': No such file"
         } else {
             & cat "$publicKey" | ssh $userAtMachine "umask 077; test -d .ssh || mkdir .ssh ; cat >> .ssh/authorized_keys || exit 1"
         }
     }
    
     ssh-copy-id root@<ICPMasterIp>
    

    Note: Replace the values that are surrounded by angle brackets (< >) according to your environment.

  8. Download the latest IBM Cloud Private 3.2.0 for Windows (64-bit) Docker package. You can download the file from the IBM Passport Advantage® website.

  9. Extract the package:

     Expand-Archive \Path\TO\ibm-cloud-private-win-x64-3.2.0.zip C:\
    

Adding the Windows worker node to the cluster

  1. Obtain a valid podCIDR for the Windows worker.

    1. Obtain the reserved pod CIDR range that is used for the Linux node on the master. For example:

        # ip route show | grep -Eo '[0-9.]+/[0-9]+'|grep "^10.1\."
        10.1.13.128/26
        10.1.120.64/26
        10.1.130.0/26
        10.1.140.0/26
      

      In this example, 10.1 is the prefix of the clusterCIDR parameter. Refer to Table 5 for the definition of clusterCIDR.

    2. Determine an unreserved pod CIDR for the Windows worker based on the previous result. For example, 10.1.141.0/26.

  2. Obtain the KubeDnsServiceIp for the cluster:

     # kubectl get svc -n kube-system |grep kube-dns
     kube-dns                       ClusterIP   10.0.0.10      <none>        53/UDP,53/TCP       87m
    

    In this example, 10.0.0.10 is the value for KubeDnsServiceIp.

  3. Start the following script to add the node, specifying the correct parameter values:

     cd C:\ibm-cloud-private-win-x64-3.2.0\
    
     .\join.ps1 -masterIp <ICPMasterIp> -clusterCIDR <ClusterCidr> -serviceCIDR <ServiceCidr> -kubeDnsServiceIp <KubeDnsServiceIp> -podCIDR <PodCIDR> -license accept
    

    Note: Run Get-Help .\join.ps1 to view the usage information for this script. Refer to Table 5 for the parameter definitions.

  4. Verify the results by running the following command on the master node:

     # kubectl get node
     NAME             STATUS   ROLES         AGE    VERSION
     172.16.200.184   Ready    etcd,master   127m   v1.12.4+icp-ee
     172.16.200.208   Ready    worker        89m    v1.12.4+icp-ee
     172.16.200.210   Ready    proxy         89m    v1.12.4+icp-ee
     172.16.200.239   Ready    management    89m    v1.12.4+icp-ee
     shags1           Ready    worker        29m    v1.12.3
    

    Note that shags1 is the Windows node.

  5. Configure the static IP routes on the cluster nodes:

    • For the Windows host, add a route for the Linux pod CIDR to the Linux private IP.

      For example:

       route -p add 10.1.13.128/26   172.16.200.184
       route -p add 10.1.120.64/26   172.16.200.208
       route -p add 10.1.130.0/26    172.16.200.210
       route -p add 10.1.140.0/26    172.16.200.239
      

      In this example, 172.16.xx.xx is the IP of a Linux cluster node.

      The pod CIDR on each Linux node can be obtained by using the following command:

       ip route show |grep blackhole
      
    • For the Linux host, add a route for the Windows pod CIDR to the Windows private IP.

      For example:

       ip route add 10.1.141.0/26 via 172.16.215.209
      

      In this example, 172.16.215.209 is the IP of the Windows node, and 10.1.141.0/26 is the pod CIDR that was determined in Step 1.
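
The static routes in Step 5 follow one pattern in each direction, so the commands can be generated from a mapping of pod CIDRs to node IPs. The following Python sketch is an illustration under assumptions: the function names windows_route_commands and linux_route_command are hypothetical, and the mapping must be filled in from your own cluster. The sketch only builds the command strings; it does not run them.

```python
import ipaddress


def windows_route_commands(linux_pod_cidrs):
    """Build the route commands to run on the Windows host.

    linux_pod_cidrs maps each Linux pod CIDR to the private IP of its node.
    """
    commands = []
    for cidr, node_ip in linux_pod_cidrs.items():
        network = ipaddress.ip_network(cidr)     # raises ValueError if malformed
        gateway = ipaddress.ip_address(node_ip)  # raises ValueError if malformed
        commands.append("route -p add %s %s" % (network, gateway))
    return commands


def linux_route_command(windows_pod_cidr, windows_ip):
    """Build the route command to run on each Linux host."""
    network = ipaddress.ip_network(windows_pod_cidr)
    gateway = ipaddress.ip_address(windows_ip)
    return "ip route add %s via %s" % (network, gateway)


# Values from the examples in this topic.
linux_nodes = {
    "10.1.13.128/26": "172.16.200.184",
    "10.1.120.64/26": "172.16.200.208",
    "10.1.130.0/26": "172.16.200.210",
    "10.1.140.0/26": "172.16.200.239",
}
for command in windows_route_commands(linux_nodes):
    print(command)
print(linux_route_command("10.1.141.0/26", "172.16.215.209"))
```

Parsing each value with the ipaddress module means that a mistyped CIDR or IP address is rejected before any route command is produced.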

Table 5. Windows worker node specifications
Parameter | Description | Default value
masterIp | IBM Cloud Private master IP address. |
clusterCIDR | The global subnet that is used by all pods in the cluster. Each node is assigned a smaller subnet from this range for its pods to use (/26 in the examples in this topic). It is equal to network_cidr that is defined in config.yaml. | 10.1.0.0/16
serviceCIDR | A non-routable, purely virtual subnet that is used by pods to uniformly access services regardless of the network topology. It is converted to and from a routable address space by kube-proxy running on the nodes. It is equal to service_cluster_ip_range that is defined in config.yaml. | 10.0.0.0/16
kubeDnsServiceIp | IP address of the "kube-dns" service that is used for DNS resolution and cluster service discovery. You can get its value from Step 2. | 10.0.0.10
podCIDR | The unreserved subnet from the Calico IP pool that is used to allocate IPs to the individual containers. You can determine its value in Step 1. |
license | IBM Cloud Private license agreement. | accept
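
Before you run join.ps1, the network parameters can be checked for consistency with each other. The following Python sketch is an illustration under assumptions: the function names pick_pod_cidr and check_join_parameters are hypothetical, the /26 prefix mirrors the examples in this topic, and the checks only consider the reserved CIDRs that you supply (for example, the output of the ip route command in Step 1).

```python
import ipaddress


def pick_pod_cidr(cluster_cidr, reserved_cidrs, prefix=26):
    """Return the first subnet of cluster_cidr that overlaps no reserved CIDR."""
    cluster = ipaddress.ip_network(cluster_cidr)
    reserved = [ipaddress.ip_network(cidr) for cidr in reserved_cidrs]
    for candidate in cluster.subnets(new_prefix=prefix):
        if not any(candidate.overlaps(block) for block in reserved):
            return str(candidate)
    raise ValueError("no unreserved /%d subnet in %s" % (prefix, cluster_cidr))


def check_join_parameters(cluster_cidr, service_cidr, kube_dns_ip, pod_cidr,
                          reserved_cidrs):
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    cluster = ipaddress.ip_network(cluster_cidr)
    service = ipaddress.ip_network(service_cidr)
    pod = ipaddress.ip_network(pod_cidr)
    if not pod.subnet_of(cluster):
        problems.append("podCIDR %s is outside clusterCIDR %s" % (pod, cluster))
    if ipaddress.ip_address(kube_dns_ip) not in service:
        problems.append("kubeDnsServiceIp %s is outside serviceCIDR %s"
                        % (kube_dns_ip, service))
    for cidr in reserved_cidrs:
        if pod.overlaps(ipaddress.ip_network(cidr)):
            problems.append("podCIDR %s overlaps reserved CIDR %s" % (pod, cidr))
    return problems


# Reserved pod CIDRs and parameter values from the examples in this topic.
reserved = ["10.1.13.128/26", "10.1.120.64/26", "10.1.130.0/26", "10.1.140.0/26"]
print(check_join_parameters("10.1.0.0/16", "10.0.0.0/16", "10.0.0.10",
                            "10.1.141.0/26", reserved))
```

pick_pod_cidr returns the first free subnet in address order. Any other unreserved subnet, such as the 10.1.141.0/26 value that is used in this topic, is equally valid.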

Deploying a sample Windows service

  1. Update the IBM Cloud Private image policy to allow the Windows image to be deployed:

     kubectl edit ClusterImagePolicy -n kube-system
    

    Add the following value to the repositories section:

         - name: mcr.microsoft.com/windows/*
    
  2. Run the web server application to create the deployment:

     # wget https://raw.githubusercontent.com/Microsoft/SDN/master/Kubernetes/flannel/l2bridge/manifests/simpleweb.yml -O win-webserver.yaml
    
     # kubectl apply -f win-webserver.yaml
    

    It takes a few minutes to pull the Windows Server Core image. After the deployment completes, two pods are in the Running status. For example:

     # kubectl get po -o wide
     NAME                             READY   STATUS    RESTARTS   AGE   IP            NODE          NOMINATED NODE
     win-webserver-578967f9d4-6t2sx   1/1     Running   0          88m   10.1.141.46   shags1        <none>
     win-webserver-578967f9d4-vcwqx   1/1     Running   0          88m   10.1.141.45   shags1        <none>
    

    Two containers are started on the Windows server. For example:

     # docker ps |findstr powershell
     cb51f5d23630        17b224ab9b3a            "powershell.exe -com…"   About an hour ago   Up About an hour                        k8s_windowswebserver_win-webserver-578967f9d4-vcwqx_default_bb079250-2062-11e9-9104-00163e01c8b8_0
     8de0cbb092b8        17b224ab9b3a            "powershell.exe -com…"   About an hour ago   Up About an hour                        k8s_windowswebserver_win-webserver-578967f9d4-6t2sx_default_bb089f7e-2062-11e9-9104-00163e01c8b8_0
    
  3. Verify the pod and service:

    • Find the pod IP:

        # kubectl get po -o wide
        NAME                             READY   STATUS    RESTARTS   AGE   IP            NODE          NOMINATED NODE
        win-webserver-578967f9d4-6t2sx   1/1     Running   0          88m   10.1.141.46   shags1        <none>
        win-webserver-578967f9d4-vcwqx   1/1     Running   0          88m   10.1.141.45   shags1        <none>
      
    • Access the sample Windows application by the pod IP:

        # curl 10.1.141.45:80
            <html><body><H1>Windows Container Web Server</H1><p>IP 10.1.141.45 callerCount 4 <p>IP 10.1.141.45 callerCount 1 </body></html>
      
    • Find the service IP:

        # kubectl get svc
        NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
        kubernetes      ClusterIP   10.0.0.1       <none>        443/TCP        4h45m
        win-webserver   NodePort    10.0.193.104   <none>        80:31700/TCP   94m
      
    • Verify access by the service IP:

        # curl 10.0.193.104
            <html><body><H1>Windows Container Web Server</H1><p>IP 10.1.141.45 callerCount 5 <p>IP 10.1.141.45 callerCount 1 </body></html>
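
The curl checks above can be repeated from a script. The following Python sketch is an illustration under assumptions: the function names fetch and parse_caller_counts are hypothetical, and the parsing relies on the page format that is shown in the sample responses in this topic. The fetch function is defined but not called here, because it needs a host that can reach the pod network.

```python
import re
import urllib.request


def fetch(url, timeout=5):
    """Return the response body of url as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_caller_counts(body):
    """Extract (ip, callerCount) pairs from the sample web server page."""
    if "Windows Container Web Server" not in body:
        raise ValueError("response is not from the sample Windows web server")
    pairs = re.findall(r"IP (\S+) callerCount (\d+)", body)
    return [(ip, int(count)) for ip, count in pairs]


# Response body copied from the pod IP example in this topic.
sample = ("<html><body><H1>Windows Container Web Server</H1>"
          "<p>IP 10.1.141.45 callerCount 4 <p>IP 10.1.141.45 callerCount 1 "
          "</body></html>")
print(parse_caller_counts(sample))
```

On a cluster node, a call such as parse_caller_counts(fetch("http://10.1.141.45:80")) performs the same check as the curl command, with the pod IP or service IP replaced by the value from your environment.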
      

Known issues and limitations

  1. All cluster nodes, including the Windows node, must be in the same subnet.
  2. The Windows container cannot access the Internet.
  3. A Windows application that is exposed by a service of type NodePort cannot be accessed through the node port, because the node port of the service is inaccessible from the cluster node.

Troubleshooting

The following problems have been identified, and resolutions are available:

Error thrown when installing Docker

Symptoms:

PS C:\Users\Administrator> Install-Package -Name docker -ProviderName DockerMsftProvider -Force
WARNING: A restart is required to enable the containers feature. Please restart your machine.
Install-Package : Cannot rename because item at 'C:\Program Files\dummyName' does not exist.
At line:1 char:1
+ Install-Package -Name docker -ProviderName DockerMsftProvider -Force
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (Microsoft.Power....InstallPackage:InstallPackage) [Install-Package],
   Exception
    + FullyQualifiedErrorId : InvalidOperation,Microsoft.PowerShell.Commands.RenameItemCommand,Microsoft.PowerShell.Pa
   ckageManagement.Cmdlets.InstallPackage

Cause:

This is a known Windows Server 2019 issue. For more information, see the MicrosoftDockerProvider issue.

Resolving the problem:

When you see this error, you can ignore it and continue the procedure.

Joining the cluster failed

Symptoms:

PLAY [Join Windows to ICP cluster] *******************************************************

TASK [Label the node shags1 as worker role]
Error from server (NotFound): nodes "shags1" not found

Cause:

The master node cannot recognize the Windows node.

Resolving the problem:

Add the IP address and host name of the Windows node to the /etc/hosts file on the Linux nodes. Then, run rm c:\k\ in PowerShell on the Windows node, and rejoin the node to the cluster.