Service is down - Error CrashLoopBackOff

All my services went down (test and prod) and APIs are returning 503 from nginx.

One of our clusters wasn’t marked as Production for some reason.

We did not change anything from our side and the service is down after Qovery updated the cluster this morning for some reason.

It seems the issue was resolved by changing the application port from 80 to 8080.
No errors were shown during deployment saying there is an issue with the port. The deployment was successful, but the application was not working.

We are having the same issue. I tried the port change described below, but no success.

Looking at the Deployment logs, the pods try to launch and go into error. After several retries the logs describes a Helm update error.

Application at commit XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX deployment is in progress :hourglass_flowing_sand:, below the current status:
:twisted_rightwards_arrows: Kubernetes load balancer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX is READY

:artificial_satellite: Application has 2 pods. 0 starting, 0 terminating and 2 in error
┃ |__ Pod XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX is FAILING crash loop, pod is restarting too frequently. Look into your application logs
┃ |__ :anger: Pod crashed 8 times
┃ |__ :warning: Back-off restarting failed container
┃ |__ :information_source: Container image “XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX” already present on machine
┃ |__ :information_source: Created container XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
┃ |__ Pod XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX is FAILING crash loop, pod is restarting too frequently. Look into your application logs
┃ |__ :anger: Pod crashed 6 times
┃ |__ :warning: Back-off restarting failed container
┃ |__ :information_source: Container image “XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX” already present on machine
┃ |__ :information_source: Created container XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


:rescue_worker_helmet: Need Help ? Please consult our FAQ to troubleshoot your deployment Troubleshoot | Docs | Qovery and visit the forum https://discuss.qovery.com/
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Helm command UPGRADE for release Error on XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX helm upgrade terminated with an error: CommandError { full_details: Some(“Helm timed out for release XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX during helm UPGRADE: Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Command still running. No output available. Waiting for next line…Error: UPGRADE FAILED: an error occurred while rolling back the release. original upgrade error: timed out waiting for the condition: release XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX failed: timed out waiting for the conditionrelease XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX failedhelm.sh/helm/v3/pkg/action.(*Rollback).performRollback\thelm.sh/helm/v3/pkg/action/rollback.go:215helm.sh/helm/v3/pkg/action.(*Rollback).Run\thelm.sh/helm/v3/pkg/action/rollback.go:79helm.sh/helm/v3/pkg/action.(*Upgrade).failRelease\thelm.sh/helm/v3/pkg/action/upgrade.go:478helm.sh/helm/v3/pkg/action.(*Upgrade).reportToPerformUpgrade\thelm.sh/helm/v3/pkg/action/upgrade.go:343helm.sh/helm/v3/pkg/action.(*Upgrade).releasingUpgrade\thelm.sh/helm/v3/pkg/action/upgrade.go:402runtime.goexit\truntime/asm_amd64.s:1371an error occurred while rolling back the release. original upgrade error: timed out waiting for the conditionhelm.sh/helm/v3/pkg/action.(*Upgrade).failRelease\thelm.sh/helm/v3/pkg/action/upgrade.go:479helm.sh/helm/v3/pkg/action.(*Upgrade).reportToPerformUpgrade\thelm.sh/helm/v3/pkg/action/upgrade.go:343helm.sh/helm/v3/pkg/action.(*Upgrade).releasingUpgrade\thelm.sh/helm/v3/pkg/action/upgrade.go:402runtime.goexit\truntime/asm_amd64.s:1371UPGRADE FAILEDmain.newUpgradeCmd.func2\thelm.sh/helm/v3/cmd/helm/upgrade.go:202github.com/spf13/cobra.(*Command).execute\tgithub.com/spf13/cobra@v1.2.1/command.go:856github.com/spf13/cobra.(*Command).ExecuteC\tgithub.com/spf13/cobra@v1.2.1/command.go:974github.com/spf13/cobra.(*Command).Execute\tgithub.com/spf13/cobra@v1.2.1/command.go:902main.main\thelm.sh/helm/v3/cmd/helm/helm.go:87runtime.main\truntime/proc.go:225runtime.goexit\truntime/asm_amd64.s:1371: Command terminated with a non success exit status code: exit status: 1”), message_safe: “Error while executing Helm command.” }
:x: Deployment of Application failed ! Look at the report above and to understand why.
:rescue_worker_helmet: Need Help ? Please consult our FAQ to troubleshoot your deployment Troubleshoot | Docs | Qovery and visit the forum https://discuss.qovery.com/
:bomb: Deployment failed
Qovery Engine has terminated the deployment

It would appear that our issue stemmed from the Docker image and config we were using. It worked on K8S 1.23 but would not work on 1.24.

More specifically there were some issues with Apache permissions which affected port binding. We also leveraged the s6-overlay in our Docker config.