rpergamin

CES - SMB AD Authentication password reset failure

‏2017-03-13T12:36:52Z | active ad ces directory gpfs smb

Hi,

 

I have a CES cluster with 2 CES nodes serving SMB.

It is authenticated against an AD directory server.

Roughly every week (the latest occurrence came after two weeks), the system loses authentication, because its attempt to reset the password fails.
The errors seen in the logs are kinit "preauthentication failed" errors.

If I remove the authentication (file) and re-add it, everything returns to normal, until the next time.

From the AD side everything looks fine, and the entity (the NetBIOS name given) has the permissions it needs to reset its password.

Has anyone experienced such behaviour?

Has anyone used an option to disable the password reset, like this one?

http://simon.rozman.si/computers/virtual/auth-failure-after-reverting-to-snapshot

This causes a massive disruption to service on a weekly basis.

Thanks!
 

  • christofschmitt

    ‏2017-03-13T18:26:43Z  

    The background here is that the CES cluster has a machine account to access Active Directory. By default, the password for the machine account is changed every week. A problem with the password change was recently fixed in Samba, and that problem might be what triggered this failure: https://bugzilla.samba.org/show_bug.cgi?id=12262

     

    A workaround can be disabling the machine account password change:

    /usr/lpp/mmfs/bin/net conf setparm global 'machine password timeout' 0

    although this should be removed again once the code has been fixed.

     

    I suggest reporting this through a PMR.

  • rpergamin

    ‏2017-03-13T18:47:56Z  

    Thanks!
    I wonder whether I am the only one this is happening to.
    Haven't you seen it anywhere else?

     

    Regards,

    Ran

     

  • christofschmitt

    ‏2017-03-13T19:57:12Z  

    It is a sporadic problem. I have seen one case of a lost machine account password, but that did not have sufficient data to determine the exact cause.

  • rpergamin

    ‏2017-03-14T04:36:21Z  

    Thanks !

    /usr/lpp/mmfs/bin/net conf setparm global 'machine password timeout' 0

    Do I need to run it on every single CES node, or just on one of them?

  • christofschmitt

    ‏2017-03-14T17:02:31Z  

    No, it does not need to be run on every node. The command automatically updates the config, which is kept in a database used by all protocol nodes.

    As mentioned earlier, this should not be the permanent solution, but rather a workaround.

  • rpergamin

    ‏2017-03-20T12:49:26Z  

    One week later, despite the parameter change, the problem is back. The connection still shows as healthy in "mmhealth", but the logs clearly show that it is failing.
     
    Can you please advise ASAP what we should do?
     
     
     
    Jaffa01:
     
    [2017/03/20 11:03:36.992898,  1] ../source3/winbindd/winbindd_pam.c:1439(winbind_samlogon_retry_loop)
      winbind_samlogon_retry_loop: sam_logon returned ACCESS_DENIED.  Maybe the DC has Restrict NTLM set or the trust account password was changed and we didn't know it. Killing connections to domain BGTEHILA
    [2017/03/20 11:03:37.229387,  1] ../source3/rpc_client/cli_pipe.c:421(cli_pipe_validate_current_pdu)
      ../source3/rpc_client/cli_pipe.c:421: Bind NACK received from host vm-dcbgsdmz.bgtehila.COM!
    [2017/03/20 11:03:37.245312,  1] ../source3/rpc_client/cli_pipe.c:3316(cli_rpc_pipe_open_schannel_with_creds)
     cli_rpc_pipe_open_schannel_with_creds: rpc_pipe_bind failed with error NT_STATUS_NETWORK_ACCESS_DENIED
    [2017/03/20 11:27:52.088002,  1] ../source3/libsmb/trusts_util.c:264(trust_pw_change)
      2017/03/20 11:27:52 : trust_pw_change(BGTEHILA): Changed password locally
    [2017/03/20 11:27:52.096767,  0] ../source3/libsmb/trusts_util.c:272(trust_pw_change)
      2017/03/20 11:27:52 : trust_pw_change(BGTEHILA) remote password change set failed - NT_STATUS_WRONG_PASSWORD
    [2017/03/20 14:26:13.585981,  0] ../source3/libads/kerberos_util.c:74(ads_kinit_password)
      kerberos_kinit_password DBCAMNAS$@BGTEHILA.COM failed: Preauthentication failed
    [2017/03/20 14:26:13.586033,  1] ../source3/winbindd/winbindd_ads.c:136(ads_cached_connection_connect)
      ads_connect for domain BGTEHILA failed: Preauthentication failed
    [

  • christofschmitt

    ‏2017-08-02T20:34:45Z  

    >   2017/03/20 11:27:52 : trust_pw_change(BGTEHILA) remote password change set failed - NT_STATUS_WRONG_PASSWORD

     

    Does Active Directory refuse password changes for the account? In that case, https://bugzilla.samba.org/show_bug.cgi?id=12782 might apply.

    Please open a PMR for debugging and once the problem is confirmed, we can provide a fix.
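
The sequence quoted above (a local password change followed by a failed remote change) can be searched for mechanically before opening a ticket. What follows is only a minimal sketch, assuming the winbindd log keeps the two-line layout seen in the excerpts in this thread: a bracketed header line, then the message line. The names `parse_entries` and `find_half_done_password_changes` are hypothetical helpers for illustration and not part of Samba or Spectrum Scale; the sample lines are copied from the log posted earlier.

```python
import re

# Header line layout assumed from the excerpts in this thread:
# "[YYYY/MM/DD HH:MM:SS.ffffff,  level] path/file.c:line(function)"
HEADER = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*(?P<level>\d+)\]"
    r"\s+\S+\((?P<func>\w+)\)"
)

SAMPLE_LOG = """\
[2017/03/20 11:27:52.088002,  1] ../source3/libsmb/trusts_util.c:264(trust_pw_change)
  2017/03/20 11:27:52 : trust_pw_change(BGTEHILA): Changed password locally
[2017/03/20 11:27:52.096767,  0] ../source3/libsmb/trusts_util.c:272(trust_pw_change)
  2017/03/20 11:27:52 : trust_pw_change(BGTEHILA) remote password change set failed - NT_STATUS_WRONG_PASSWORD
[2017/03/20 14:26:13.585981,  0] ../source3/libads/kerberos_util.c:74(ads_kinit_password)
  kerberos_kinit_password DBCAMNAS$@BGTEHILA.COM failed: Preauthentication failed
"""


def parse_entries(lines):
    """Pair each header line with the message line(s) that follow it."""
    entries = []
    current = None
    for raw in lines:
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = {"ts": match.group("ts"), "func": match.group("func"), "msg": ""}
            entries.append(current)
        elif current is not None and line:
            # Continuation text belongs to the most recent header.
            current["msg"] = (current["msg"] + " " + line).strip()
    return entries


def find_half_done_password_changes(entries):
    """Return header timestamps of remote failures that follow a local change."""
    hits = []
    changed_locally = False
    for entry in entries:
        if entry["func"] != "trust_pw_change":
            continue
        if "Changed password locally" in entry["msg"]:
            changed_locally = True
        elif "remote password change set failed" in entry["msg"]:
            if changed_locally:
                hits.append(entry["ts"])
            changed_locally = False
    return hits


if __name__ == "__main__":
    for ts in find_half_done_password_changes(parse_entries(SAMPLE_LOG.splitlines())):
        print("local change not accepted remotely at", ts)
```

A hit only shows that the two messages occurred in that order; it does not by itself confirm which of the linked Samba bugs applies.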

  • jschmiedt

    ‏2019-01-10T11:17:29Z  

    Hi,

     

    We're having a similar problem, probably also related to password changing. We're running Spectrum Scale 5.0.2 with two CES servers, which intermittently lose authentication. The logs then show something like this:

    [2019/01/04 11:58:58.236347,  0] ../source3/libsmb/trusts_util.c:564(trust_pw_change)
      2019/01/04 11:58:58 : trust_pw_change(ESI): Finished password change.
    [2019/01/04 11:58:58.237703,  0] ../source3/libsmb/trusts_util.c:617(trust_pw_change)
      2019/01/04 11:58:58 : trust_pw_change(ESI): Verified new password remotely using netlogon_creds_cli:CLI[PNS/PNS$]/SRV[ESI-SVDC001/ESI]
    [2019/01/04 12:14:26.044624,  0] ../source3/libads/kerberos_util.c:74(ads_kinit_password)
      kerberos_kinit_password PNS$@ESI.LOCAL failed: Preauthentication failed
    [2019/01/04 12:14:26.044716,  1] ../source3/libads/sasl.c:802(ads_sasl_spnego_bind)
      ads_sasl_spnego_gensec_bind(KRB5) failed for ldap/esi-svdc020.esi.local with user[PNS$] realm[ESI.LOCAL]: Preauthentication failed, fallback to NTLMSSP
    [2019/01/04 12:14:26.154126,  1] ../source3/libads/sasl.c:821(ads_sasl_spnego_bind)
      ads_sasl_spnego_gensec_bind(NTLMSSP) failed for ldap/esi-svdc020.esi.local with user[PNS$] realm=[ESI.LOCAL]: Invalid credentials
    [2019/01/04 12:14:26.154166,  1] ../source3/libads/ldap_utils.c:111(ads_do_search_retry_internal)
      ads_search_retry: failed to reconnect (Invalid credentials)
    [2019/01/04 12:57:06.465616,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 1000 to 500 due to IO_TIMEOUT
    [2019/01/04 13:49:32.355648,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 500 to 250 due to IO_TIMEOUT
    [2019/01/04 17:45:25.322343,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 250 to 125 due to IO_TIMEOUT
    [2019/01/06 04:22:55.276781,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 1000 to 500 due to IO_TIMEOUT
    [2019/01/06 06:27:22.566603,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 500 to 250 due to IO_TIMEOUT
    [2019/01/06 07:02:44.485658,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 250 to 125 due to IO_TIMEOUT
    [2019/01/07 03:37:18.709867,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 1000 to 500 due to IO_TIMEOUT
    [2019/01/07 04:22:55.604085,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 500 to 250 due to IO_TIMEOUT
    [2019/01/07 07:07:23.645610,  1] ../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
      Reducing LDAP page size from 250 to 125 due to IO_TIMEOUT
    [2019/01/08 09:00:53.796596,  1] ../source3/rpc_client/cli_pipe.c:421(cli_pipe_validate_current_pdu)
      ../source3/rpc_client/cli_pipe.c:421: Bind NACK received from host ESI-svDC013.ESI.local!
    [2019/01/08 12:07:08.390406,  1] ../source3/winbindd/winbindd_pam.c:1437(winbind_samlogon_retry_loop)
      winbind_samlogon_retry_loop: sam_logon returned ACCESS_DENIED.  Maybe the DC has Restrict NTLM set or the trust account password was changed and we didn't know it. Killing connections to domain ESI
    [2019/01/08 12:07:08.460801,  1] ../source3/rpc_client/cli_pipe.c:421(cli_pipe_validate_current_pdu)
      ../source3/rpc_client/cli_pipe.c:421: Bind NACK received from host ESI-svDC001.ESI.local!
    [2019/01/09 11:14:25.138047,  1] ../libcli/smb/tstream_smbXcli_np.c:138(tstream_smbXcli_np_destructor)
      tstream_smbXcli_np_destructor: cli_close failed on pipe netlogon. Error was NT_STATUS_INTERNAL_ERROR
    [2019/01/09 11:15:32.075707,  1] ../source3/winbindd/winbindd_pam.c:1437(winbind_samlogon_retry_loop)
      winbind_samlogon_retry_loop: sam_logon returned ACCESS_DENIED.  Maybe the DC has Restrict NTLM set or the trust account password was changed and we didn't know it. Killing connections to domain ESI
    [2019/01/09 11:15:32.122374,  1] ../source3/winbindd/winbindd_pam.c:1437(winbind_samlogon_retry_loop)
      winbind_samlogon_retry_loop: sam_logon returned ACCESS_DENIED.  Maybe the DC has Restrict NTLM set or the trust account password was changed and we didn't know it. Killing connections to domain ESI
    [2019/01/09 11:15:32.343039,  1] ../source3/winbindd/winbindd_pam.c:1437(winbind_samlogon_retry_loop)
      winbind_samlogon_retry_loop: sam_logon returned ACCESS_DENIED.  Maybe the DC has Restrict NTLM set or the trust account password was changed and we didn't know it. Killing connections to domain ESI
    [2019/01/09 11:15:32.472602,  1] ../source3/winbindd/winbindd_pam.c:1437(winbind_samlogon_retry_loop)
    

     

    The only way to fix this problem has been to remove authentication and add it again. It seems to occur more often when a DC is unresponsive, e.g. due to a restart. Has this problem actually been fixed?

     

  • christofschmitt

    ‏2019-01-10T17:13:59Z  

    The machine account password is changed on the domain controller ESI-SVDC001.

    Subsequently, access to other domain controllers fails. This could be related to the password change not being replicated immediately to the other DCs.

    Another question: Does the Active Directory environment contain read-only domain controllers?

    In any case, it would be worthwhile to raise this as a support ticket, so that we can debug this in more detail and come to a solution.

    Christof
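
To collect evidence for the replication question above, it can help to list which DC verified the new password and which hosts reported failures afterwards. The following is a rough sketch under the assumption that the message texts match the excerpts posted in this thread; `dcs_failing_after_change` and `short_name` are hypothetical names, and the sample lines are taken from the log above.

```python
import re

# Message patterns taken from the log excerpts in this thread.
VERIFIED = re.compile(r"Verified new password remotely using \S*SRV\[(?P<dc>[^/\]]+)/")
LDAP_FAIL = re.compile(r"failed for ldap/(?P<host>[\w.-]+)")
BIND_NACK = re.compile(r"Bind NACK received from host (?P<host>[\w.-]+)!")

SAMPLE_LINES = """\
  2019/01/04 11:58:58 : trust_pw_change(ESI): Verified new password remotely using netlogon_creds_cli:CLI[PNS/PNS$]/SRV[ESI-SVDC001/ESI]
  ads_sasl_spnego_gensec_bind(KRB5) failed for ldap/esi-svdc020.esi.local with user[PNS$] realm[ESI.LOCAL]: Preauthentication failed, fallback to NTLMSSP
  ads_sasl_spnego_gensec_bind(NTLMSSP) failed for ldap/esi-svdc020.esi.local with user[PNS$] realm=[ESI.LOCAL]: Invalid credentials
  ../source3/rpc_client/cli_pipe.c:421: Bind NACK received from host ESI-svDC013.ESI.local!
  ../source3/rpc_client/cli_pipe.c:421: Bind NACK received from host ESI-svDC001.ESI.local!
""".splitlines()


def short_name(host):
    """Normalise 'esi-svdc020.esi.local' and 'ESI-SVDC020' to the same key."""
    return host.split(".")[0].upper()


def dcs_failing_after_change(lines):
    """Return (DC that verified the change, hosts that failed afterwards, in order)."""
    verified_dc = None
    failing = []
    for line in lines:
        match = VERIFIED.search(line)
        if match:
            # A new verified change starts a fresh observation window.
            verified_dc = short_name(match.group("dc"))
            failing = []
            continue
        if verified_dc is None:
            continue
        match = LDAP_FAIL.search(line) or BIND_NACK.search(line)
        if match:
            name = short_name(match.group("host"))
            if name not in failing:
                failing.append(name)
    return verified_dc, failing


if __name__ == "__main__":
    dc, failing = dcs_failing_after_change(SAMPLE_LINES)
    print("password change verified on:", dc)
    print("hosts reporting failures afterwards:", ", ".join(failing))
```

The output is only a tally of host names. Whether the failures stem from replication delay, a read-only DC, or something else still needs the debugging described above.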