While working with an internal server recently, we ran into an issue where going to /social failed. None of the calls got a response, and eventually we got a timeout from the HTTP server. We did quite a bit of troubleshooting, and I thought it would be good to document it all here: first so you can see how we resolved the issue, and second so you have an idea of how we troubleshot it, in case these steps help you with different issues in an Orient Me environment.
The first thing we did was look at the Orient Me logs while reproducing the issue (going to /social in a browser). To do this, we ran the following commands on the Orient Me server:
kubectl get pods
This gives you a list of pods. There will be 3 that start with orient-webclient- ; get the exact name of each, then run this command for each one in a separate command window:
kubectl logs orient-webclient-2180930634-d6k12 -f --tail=5
NOTE: --tail=5 is the number of previous lines to display, and -f follows the log, so this tails the pod's logs and shows new lines as they are written.
Once you are tailing the pods, reproduce the issue and watch the logs to see what's happening. In our case, we saw the following messages in the logs:
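If you would rather not open three windows, something like the following should tail all of the webclient pods from one shell. Treat it as a sketch: the function names webclient_pods and tail_webclient_pods are made up for this post, and the only thing taken from our environment is the orient-webclient- prefix.

```shell
# List pod names that start with the given prefix (default: orient-webclient-).
# webclient_pods and tail_webclient_pods are hypothetical helper names.
webclient_pods() {
  prefix="${1:-orient-webclient-}"
  kubectl get pods --no-headers -o custom-columns=NAME:.metadata.name \
    | grep "^${prefix}"
}

# Tail every matching pod in the background, then wait so Ctrl-C stops them all.
tail_webclient_pods() {
  for pod in $(webclient_pods "$@"); do
    kubectl logs "$pod" -f --tail=5 &
  done
  wait
}
```

Running tail_webclient_pods with no arguments interleaves the output of all matching pods, so it is less tidy than separate windows but quicker to set up.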
2017-06-27T15:54:37.942Z - info: [orient-web-client] ENTRY: AuthenticationMiddleware.ensureAuthenticated undefined
2017-06-27T15:54:37.942Z - info: [orient-web-client] ENTRY: AuthenticationMiddleware.handleAuthentication undefined
2017-06-27T15:54:37.943Z - info: [orient-web-client] ENTRY: AuthenticationMiddleware.extractLDAPCookies undefined
2017-06-27T15:54:37.944Z - info: [orient-web-client] EXIT: AuthenticationMiddleware.extractLDAPCookies { subscriberid: undefined,
userid: undefined,
displayName: undefined,
organizationId: undefined }
2017-06-27T15:54:37.944Z - debug: [auth-service] auth-service: no JWT found, rejecting call with error: no_jwt_token
2017-06-27T15:54:37.945Z - debug: [auth-service] >auth.service: about to call setJWT to generate or re-generate token...
2017-06-27T15:54:37.946Z - debug: [auth-service] Getting user profile with url: https://connections.domain.com/connections/opensocial/rest/people/@me/@self
This was the last line of the trace. It's a call from Orient Me to the Connections server, but from the trace it appears we never get a response.
The first thing I did was test that URL in a browser. This worked fine, and it simply returned information about the logged-in user.
Next we reproduced the issue again and looked at the access logs on the Connections HTTP server. At the time of this message, we didn't see any request to /connections/opensocial/rest/people/@me/@self in the HTTP logs. So this was the issue: for some reason, the call to the Connections server was not getting to the HTTP server.
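To make the access-log check repeatable, a small helper along these lines can count how many logged requests mention a path. This is only a sketch: count_requests is a made-up name, and the log file location is an assumption that depends on how your HTTP server is configured, so pass in whatever file your server actually writes to.

```shell
# Count access-log lines that contain the given request path.
# Usage: count_requests <access_log_file> <request_path>
# (count_requests is a hypothetical helper; the log path is whatever your server uses.)
count_requests() {
  log_file="$1"
  request_path="$2"
  # -F treats the path as a fixed string (no regex), -c prints the match count.
  # grep exits non-zero when the count is 0, so "|| true" keeps the helper quiet.
  grep -cF -- "$request_path" "$log_file" || true
}
```

A result of 0 for /connections/opensocial/rest/people/@me/@self right after reproducing the problem is what told us the request never arrived.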
Next we ran a curl command to see whether we could connect over HTTP. I did this from the pod itself.
First connect to the pod:
kubectl exec -it <pod_name> -- bash
Then run:
curl -X GET http://connections.domain.com
This failed to connect, so next I tried the same thing from the host machine of the Orient Me server. On the host, it also failed to connect.
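When a connection hangs like this, it helps to give curl a time limit and print only the outcome. The helper below is a sketch under a few assumptions: check_http is a made-up name, the 10 second default is arbitrary, and it relies only on standard curl options (-s, -S, -o, -m, -w).

```shell
# Try an HTTP GET and report either the status code or a failure.
# Usage: check_http <url> [timeout_seconds]
# (check_http is a hypothetical helper name; the 10s default is arbitrary.)
check_http() {
  url="$1"
  max_seconds="${2:-10}"
  # -sS: quiet but still show errors; -o /dev/null: discard the body;
  # -m: give up after $max_seconds; -w: print only the HTTP status code.
  if code="$(curl -sS -o /dev/null -m "$max_seconds" -w '%{http_code}' "$url")"; then
    echo "OK $url -> HTTP $code"
  else
    echo "FAILED $url (curl could not connect or timed out)"
    return 1
  fi
}
```

Running the same check_http call inside the pod and on the host makes it easy to see whether both fail the same way, which is what we saw.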
At this point I tried to determine why. First we tried to ping the Connections hostname from the host Orient Me is running on, by running
ping connections.domain.com
and
nslookup connections.domain.com
This returned a valid IP address, but in this environment connections.domain.com has 2 valid IP addresses: one internal and one external. The Orient Me machine was resolving the name to the external address, and the firewall was blocking calls from Orient Me to the external IP of connections.domain.com.
Issuing curl -X GET http://connections_internal_ip_address connected fine.
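If a hostname can resolve to more than one address, it is worth checking which one a given machine actually gets. Here is a rough sketch using getent; the function names are made up, and the IP you pass in is whatever your internal address is. Note that getent goes through the system resolver, so unlike nslookup it also honors /etc/hosts.

```shell
# Print the first address the system resolver returns for a hostname.
# (resolved_ip and check_resolves_to are hypothetical helper names.)
resolved_ip() {
  getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Compare the resolved address with the one you expect (e.g. the internal IP).
# Usage: check_resolves_to <hostname> <expected_ip>
check_resolves_to() {
  actual="$(resolved_ip "$1")"
  if [ "$actual" = "$2" ]; then
    echo "$1 resolves to expected address $2"
  else
    echo "$1 resolves to ${actual:-nothing}, expected $2"
    return 1
  fi
}
```

In our case a check like this against the internal IP would have failed on the Orient Me host, because the name was resolving to the external address there.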
At this point I looked at /etc/resolv.conf on the Orient Me machine and found that it was using the DNS server with the external address. We updated this to the internal address, restarted the machine itself, and the issue went away.
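To see quickly which DNS servers a machine is configured to use, you can pull the nameserver lines out of the resolver config. This is a small sketch; list_nameservers is a made-up name, and it assumes the usual "nameserver <ip>" line format.

```shell
# Print the nameserver addresses listed in a resolv.conf-style file.
# Usage: list_nameservers [file]   (defaults to /etc/resolv.conf)
# (list_nameservers is a hypothetical helper name.)
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

From there, running nslookup connections.domain.com <nameserver_ip> against each entry shows which server hands out which address.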
Additional steps for debugging connectivity issues like this can be found here:
https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
Additional information about configuring private DNS zones for Kubernetes, which should also help in this specific case, can be found here:
http://blog.kubernetes.io/2017/04/configuring-private-dns-zones-upstream-nameservers-kubernetes.html
Anyway, I hope the details of our troubleshooting help you resolve similar issues in the future.