Our application ran out of memory and crashed. However, it restarted and began running again right away. Something appeared briefly under Overview → Pods → Current Status, but it disappeared soon after the service restarted. There doesn't seem to be a way to look into the cause or view any related metrics in the Qovery logs. Is anything about the failure stored somewhere? I can't find anything in the logs.
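When a container runs out of memory, the kernel sends a SIGKILL to the application and Kubernetes automatically restarts it. Because SIGKILL cannot be caught, the application gets no chance to log anything, so very little information is available on the Kubernetes side.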
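One thing that usually does survive the restart is the container's last termination state in the pod status, where Kubernetes records the reason `OOMKilled`. The sketch below is only an illustration, assuming you can get the pod as JSON (for example from `kubectl get pod <name> -o json`, if you have cluster access). The function name and the sample data are hypothetical. Only the `status.containerStatuses[].lastState.terminated` fields come from the Kubernetes pod status.

```python
import json


def find_oom_kills(pod):
    """Return (container_name, finished_at) for each container whose
    last termination was recorded as an OOM kill."""
    hits = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        terminated = cs.get("lastState", {}).get("terminated")
        if terminated and terminated.get("reason") == "OOMKilled":
            hits.append((cs["name"], terminated.get("finishedAt")))
    return hits


# Hypothetical, trimmed-down pod status used only to exercise the function.
SAMPLE_POD_JSON = """
{
  "status": {
    "containerStatuses": [
      {
        "name": "app",
        "restartCount": 1,
        "lastState": {
          "terminated": {
            "reason": "OOMKilled",
            "exitCode": 137,
            "finishedAt": "2024-01-01T00:00:00Z"
          }
        }
      },
      {
        "name": "sidecar",
        "restartCount": 0,
        "lastState": {}
      }
    ]
  }
}
"""

sample_pod = json.loads(SAMPLE_POD_JSON)
print(find_oom_kills(sample_pod))
```

This only confirms that an OOM kill happened and when. It does not show what was consuming the memory, which is why a monitoring service is still needed.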
To troubleshoot this kind of issue, you could use a monitoring service such as Datadog or New Relic (you can follow those instructions to install Datadog using the Qovery Helm chart feature).