Site returns a 503 status code during deployments

During deployments we sometimes run into “503 Service Temporarily Unavailable” errors, which isn’t nice for the user.

Any idea why this happens? We expected a rolling update, in which new pods are created and the pods running the older version are then shut down, with no service interruption. What can we do to mitigate this?

Hi @FlorianSuchan ,

You should not get a 503. Is the 503 coming from your application or from something else (e.g. NGINX)? Can you post a screenshot or at least give the steps to reproduce it? cc @Pierre_Mavro

The rolling update is the strategy we use to deploy the new version of your app, so you should not get any downtime. However, it can happen, since Kubernetes relies on a probe to check that your app is up and running before routing incoming traffic to the new version.

It’s possible that your app is seen as ready when it is not. If you give me the steps to reproduce the issue, I will give it a shot.

Hi @rophilogene, thanks for helping us. Please see the attached screenshot.

Hi @FlorianSuchan ,

Sorry for the late answer. This happens because your application takes time to start and your port is open before the application is able to serve traffic.

In the near future, we’ll add an option to define a user-defined check mechanism to ensure the application is ready to handle incoming traffic.
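To illustrate what such a check could probe on the application side, here is a minimal sketch of a Rack-style readiness endpoint in plain Ruby. The class name `ReadinessCheck`, the `/healthz` path, and the `ready` callable are all hypothetical names chosen for this example; they are not part of Qovery, Rails, or Rack, and the actual Qovery option may work differently.

```ruby
# Minimal Rack-style middleware (hypothetical, for illustration only).
# It answers the health path itself and passes every other request
# to the wrapped application.
class ReadinessCheck
  def initialize(app, path: "/healthz", ready: -> { true })
    @app = app       # the wrapped Rack app (e.g. a Rails application)
    @path = path     # assumed health path, not a Qovery default
    @ready = ready   # callable returning true once the app can serve traffic
  end

  def call(env)
    # Anything that is not the health path goes to the real application.
    return @app.call(env) unless env["PATH_INFO"] == @path

    if @ready.call
      [200, { "content-type" => "text/plain" }, ["ok"]]
    else
      # A non-2xx answer tells the probe to keep traffic away for now.
      [503, { "content-type" => "text/plain" }, ["starting"]]
    end
  end
end
```

In a Rails app, a class like this could be registered with `config.middleware.use ReadinessCheck`, and a probe would then be pointed at the chosen path. What counts as "ready" (database reachable, caches warmed, and so on) depends on the application.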

In the meantime, I advise you to update your code so that it only opens your application’s port once it is ready to serve traffic. That way, you should no longer run into this kind of issue.
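As a rough sketch of the idea, assuming a bare Ruby TCP server rather than a real Rails/Puma setup: all slow startup work finishes first, and the listening socket is bound last, so a port-based check cannot succeed too early. The helpers `warm_up`, `start_server`, and `handle_one` are hypothetical names for this example only.

```ruby
require "socket"

# Hypothetical stand-in for slow startup work
# (loading code, connecting to a database, warming caches).
def warm_up
  sleep 0.1
  { ready: true }
end

# Finish initialization first, bind the listening socket last.
def start_server(host = "127.0.0.1", port = 0)
  state = warm_up
  server = TCPServer.new(host, port) # port 0 lets the OS pick a free port
  [server, state]
end

# Serve a single HTTP request with a tiny plain-text response.
def handle_one(server, state)
  client = server.accept
  # Consume the request line and headers up to the blank line.
  while (line = client.gets) && line != "\r\n"; end
  body = state[:ready] ? "ok" : "starting"
  client.write("HTTP/1.1 200 OK\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n#{body}")
  client.close
end
```

The point is the ordering inside `start_server`: until `warm_up` returns, nothing is listening, so connection attempts are refused instead of being accepted by a half-started app. How to get the same ordering in a real Rails deployment depends on the application server and its configuration.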

Pierre

Hi @Pierre_Mavro ,

Thanks for reaching out. I know about readiness/liveness probes in Kubernetes, but where would I do that in a Rails app directly?

Can you share an ETA for these checks?

Best, Florian

Hi @FlorianSuchan ,

I’ve added similar issues to the Troubleshoot documentation: Troubleshoot | Docs | Qovery

Thanks

@Pierre_Mavro Since we added the health checks you shipped some time ago on our staging/production cluster, this is no longer an issue :slight_smile:
