IBM Support

Lost heartbeat on node

Troubleshooting


Problem

The Ambari agent on one node of a cluster loses its heartbeat with the Ambari server.

Symptom

The ambari-agent.log file contains messages similar to the following:

WARNING 2016-05-16 18:31:30,276 base_alert.py:140 - [Alert][ams_metrics_monitor_process] Unable to execute alert. Unable to find 'AMBARI_METRICS/package/alerts/alert_ambari_metrics_monitor.py' as an absolute path or part of /var/lib/ambari-agent/cache/stacks or /var/lib/ambari-agent/cache/host_scripts

ERROR 2016-05-16 18:31:40,013 Controller.py:340 - Unable to reconnect to https://<server>:8441/agent/v1/heartbeat/<server> (attempts=10, details=local variable 'data' referenced before assignment)

ERROR 2016-05-10 20:56:48,446 Controller.py:185 - Unable to connect to: https://<server>:8441/agent/v1/register/dx1090-dat.cluster.local
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 130, in registerWithServer
    data = json.dumps(self.register.build(self.version))
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 202, in encode
    chunks = list(chunks)
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 426, in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 400, in _iterencode_dict
    for chunk in chunks:
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 400, in _iterencode_dict
    for chunk in chunks:
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 400, in _iterencode_dict
    for chunk in chunks:
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 323, in _iterencode_list
    for chunk in chunks:
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 382, in _iterencode_dict
    yield _encoder(value)
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 48, in py_encode_basestring_ascii
    s = s.decode('utf-8')
  File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x8c in position 59147: invalid start byte

ERROR 2016-05-10 20:56:48,446 Controller.py:186 - Error:'utf8' codec can't decode byte 0x8c in position 59147: invalid start byte
WARNING 2016-05-10 20:56:48,446 Controller.py:187 - Sleeping for 29 seconds and then trying again
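
The traceback shows that the registration data built by the agent contained a byte (0x8c) that cannot start a UTF-8 sequence, so JSON encoding failed before the request was sent. The following is a minimal sketch, not part of Ambari, of how the offending bytes in a piece of data could be located. It is written for Python 3, whereas the agent in the log runs Python 2. The function name find_invalid_utf8 and the sample bytes are hypothetical and for illustration only.

```python
def find_invalid_utf8(data):
    """Return a list of (offset, byte_value, reason) tuples, one for each
    position at which strict UTF-8 decoding of `data` (bytes) fails.

    Hypothetical helper for illustration; it is not part of Ambari.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Strict decoding raises at the first invalid sequence.
            data[pos:].decode("utf-8")
            break  # The remainder decodes cleanly.
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice being decoded.
            offset = pos + err.start
            problems.append((offset, data[offset], err.reason))
            # Resume after the bytes the decoder rejected.
            pos += err.end
    return problems


if __name__ == "__main__":
    # Made-up sample: ASCII text with a stray 0x8c byte, as in the log above.
    sample = b"host=\x8cname"
    for offset, value, reason in find_invalid_utf8(sample):
        print("offset %d: byte 0x%02x (%s)" % (offset, value, reason))
```

The sketch only reports where decoding fails. It does not identify which host attribute in the agent's registration data produced the byte.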

Product: IBM Db2 Big SQL
Component: Web Console
Platform: Linux
Version: 4.1.0
Business Unit: IBM Software
Line of Business: Data Platform

Document Information

Modified date:
08 April 2021

UID

swg21984577