Why do some service instances fail to start in a 40K service instance (SI) environment?

In a 40K service instance environment, when the EGO service controller (egosc) starts thousands of service instances at the same time, some service instances fail to start with the following warnings reported in the log files:
  • EGO service controller (esc) log:
    On host hostname, the container <containerID> belong to instance <instanceID> of service <serviceName> terminated, reason <Execution username or password is incorrect. Either wrong information is provided or the execution host is not set up properly>, status <0>.
  • Process execution manager (pem) log:
    startContainer(): execution account <execUser> is non-existent for activity <containerID>
    setupContainer(): failed to set execution id for activity <containerID>

In a 40K SI environment, when egosc starts all the service instances, every process execution manager (pem) calls getpwnam() to retrieve the password database entry for the execution user. Some of these getpwnam() calls fail even though the execution credentials are correct. This issue occurs because of Network Information Service (NIS) limitations, for which there is no workaround at present.
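
To check whether account lookups on an execution host fail intermittently, a small diagnostic along the following lines may help. It is only a sketch and not a workaround. The function name count_lookup_failures is hypothetical and is not part of EGO. It uses the Python standard library pwd.getpwnam(), which wraps the same system lookup and raises KeyError when no entry is returned. Sequential calls from a single process do not reproduce the load generated by thousands of pems starting at once, so a clean result does not rule out the problem.

```python
import pwd


def count_lookup_failures(user, attempts=100, lookup=pwd.getpwnam):
    """Call lookup(user) `attempts` times and return how many calls failed.

    `lookup` defaults to pwd.getpwnam; it is a parameter only so that the
    function can be exercised with a stand-in during testing.
    """
    failures = 0
    for _ in range(attempts):
        try:
            lookup(user)
        except KeyError:
            # pwd.getpwnam raises KeyError when the name service
            # returns no entry for the requested account.
            failures += 1
    return failures
```

For example, count_lookup_failures("execUser", attempts=1000) returns the number of lookups for which no entry was returned, where "execUser" stands for the execution account named in the pem log. A nonzero count for an account that is known to exist suggests that the name service, rather than the credentials, is the cause.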