What to monitor?

It's best to monitor the following components:

  • MongoDB - following MongoDB best practices.
  • RabbitMQ - following RabbitMQ best practices.
  • Docker service - on all hosts.
  • Docker api-manager container - check that the api-manager container is running.
  • Docker api-manager - an API endpoint for monitoring the status is available at /api/status; consult the API docs for more info.
  • Docker api-worker container - check that the api-worker container is running on all worker nodes.
  • (Optional) routing layers - changes depending on your design.
  • App containers - check that the app containers are running on your worker nodes. If your Dockerfile has a HEALTHCHECK configured, Nebula will automatically restart containers marked as "unhealthy".
  • End2End network connections - if your app accepts HTTP/TCP/UDP requests, it's best to check e2e connectivity as well.
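
As a rough illustration of the api-manager check above, the sketch below polls the /api/status endpoint using only the Python standard library. The base URL is a placeholder, and treating any 2xx response as "healthy" is an assumption; the response body and any authentication requirements are described in the API docs and are not handled here.

```python
import urllib.error
import urllib.request


def status_url(base_url: str) -> str:
    """Build the status URL from the api-manager base URL."""
    return base_url.rstrip("/") + "/api/status"


def status_is_healthy(status_code: int) -> bool:
    """Assumption: any 2xx HTTP status code means the api-manager is up."""
    return 200 <= status_code < 300


def check_api_manager(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the status endpoint answers with a 2xx code."""
    try:
        with urllib.request.urlopen(status_url(base_url), timeout=timeout) as response:
            return status_is_healthy(response.status)
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a 4xx/5xx response.
        return False


if __name__ == "__main__":
    # Hypothetical address; replace with your own api-manager host.
    healthy = check_api_manager("http://api-manager.example.com")
    print("api-manager healthy:", healthy)
```

A check like this can be run from whatever monitoring system you already use; the same pattern (request, timeout, pass/fail) also works as a basic e2e probe for an HTTP app.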

Another helpful tip: you can track the status of a deployment to the worker nodes by checking their RabbitMQ queues. Each worker only ACKs a message after it has finished deploying it, so a queue is empty only when the worker has processed all changes and matches the required configuration for that app.
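
One possible way to check a queue is through the RabbitMQ management plugin's HTTP API, which reports a `messages` count (ready plus unacknowledged) for each queue. The sketch below assumes the management plugin is enabled and reachable; the host, credentials, vhost, and queue name are all placeholders, since the queue naming used by your workers is not covered here.

```python
import base64
import json
import urllib.parse
import urllib.request


def queue_url(management_url: str, vhost: str, queue: str) -> str:
    """Build the management API URL for one queue (vhost "/" becomes %2F)."""
    return "{}/api/queues/{}/{}".format(
        management_url.rstrip("/"),
        urllib.parse.quote(vhost, safe=""),
        urllib.parse.quote(queue, safe=""),
    )


def queue_is_drained(queue_info: dict) -> bool:
    """True only when the queue reports zero ready + unacknowledged messages.

    A missing "messages" field is treated as "unknown", so it returns False.
    """
    return queue_info.get("messages") == 0


def fetch_queue_info(management_url, vhost, queue, user, password, timeout=5.0):
    """Fetch the queue description as a dict, using HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        queue_url(management_url, vhost, queue),
        headers={"Authorization": "Basic " + token},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # All values below are hypothetical; substitute your own.
    info = fetch_queue_info(
        "http://rabbit.example.com:15672", "/", "my_app_queue", "monitor", "secret"
    )
    print("deployment complete:", queue_is_drained(info))
```

Because `messages` includes unacknowledged deliveries, a worker that has received a change but is still deploying it keeps the count above zero, which matches the ACK-after-deploy behavior described above.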