Container databases are not reliable

We deploy container-based Postgres databases via Qovery. However, we occasionally get these errors:

Invalid `prisma.user.findUnique()` invocation:

Can't reach database server at `z3addcc21-postgresql`:`5432`

Please make sure your database server is running at `z3addcc21-postgresql`:`5432`.

It doesn’t appear to be related to deployment order, because our databases deploy first and we also get these errors in long-running environments.

Hello @prki,

Container DBs are indeed not production-ready and are not meant to be, so you should use them with that in mind.
Every time Kubernetes has to move the DB pod from one node to another (changing node type, downsizing the cluster, etc.), your DB will have a short downtime. Unfortunately, it’s by design.
If your services can’t tolerate that, you should use a managed DB, which is well suited to near-zero downtime.
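
For staging or preview environments where a managed DB is not an option, one possible client-side mitigation is to retry queries that fail because the database is briefly unreachable. The following is only a minimal sketch, not something Qovery or Prisma provides: `withDbRetry`, `isUnreachableError`, and the default values are hypothetical names and numbers, and it assumes the failure can be recognized from the error message quoted earlier in this thread.

```typescript
// Sketch only: helper names and defaults are illustrative assumptions,
// not part of Prisma or Qovery.
interface RetryOptions {
  retries: number; // maximum number of retries after the first attempt
  baseDelayMs: number; // delay before the first retry; doubles on each retry
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Assumption: a connectivity failure can be detected from the message text
// shown in the error above.
function isUnreachableError(err: unknown): boolean {
  return (
    err instanceof Error &&
    err.message.includes("Can't reach database server")
  );
}

async function withDbRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions = { retries: 5, baseDelayMs: 200 },
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Rethrow anything that is not a connectivity error,
      // or any error once the retries are exhausted.
      if (!isUnreachableError(err) || attempt >= options.retries) {
        throw err;
      }
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      await sleep(options.baseDelayMs * 2 ** attempt);
    }
  }
}
```

A call could then be wrapped as `withDbRetry(() => prisma.user.findUnique(...))`. This only hides short interruptions such as a pod being rescheduled; it does not make the container DB itself more available.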

Cheers

We are only using it for staging and preview environments, but even there we would like more reliability. I guess it could be explained by a lot of scaling up and down activity due to preview environments (PEs) being created and destroyed.

Yes, indeed, but as I said, it’s by design unfortunately, and there is nothing we can do for the time being, so I’m not sure how I can help. It’s not a bug per se.

This is to be discussed internally, but we could eventually add a PDB (PodDisruptionBudget) for those pods to avoid them being moved too often. However, it can also clash with autoscaling and node optimization, so it would probably end up as an advanced setting.
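
For readers unfamiliar with PDBs, here is a rough sketch of what such an object could look like, built as a plain TypeScript value and printed as JSON (which `kubectl apply -f` accepts alongside YAML). It is not what Qovery deploys: the `buildDatabasePdb` helper, the `postgresql-pdb` name, and the `app` label are hypothetical, and the real selector would have to match the labels actually set on the database pod.

```typescript
// Sketch only: the name and labels are hypothetical; the labels on a
// Qovery-managed database pod may differ.
interface PodDisruptionBudget {
  apiVersion: "policy/v1";
  kind: "PodDisruptionBudget";
  metadata: { name: string };
  spec: {
    maxUnavailable: number;
    selector: { matchLabels: Record<string, string> };
  };
}

function buildDatabasePdb(
  name: string,
  matchLabels: Record<string, string>,
): PodDisruptionBudget {
  return {
    apiVersion: "policy/v1",
    kind: "PodDisruptionBudget",
    metadata: { name },
    spec: {
      // 0 means no matching pod may be evicted voluntarily (e.g. by a node drain).
      maxUnavailable: 0,
      selector: { matchLabels },
    },
  };
}

const pdb = buildDatabasePdb("postgresql-pdb", { app: "z3addcc21-postgresql" });
console.log(JSON.stringify(pdb, null, 2));
```

With `maxUnavailable: 0` on a single-replica database, a node drain cannot evict the pod, which is why it can get in the way of autoscaling and node optimization. A PDB also covers only voluntary disruptions; it does not help if the node itself fails.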
cc @a_carrano

@bchastanier I was expecting there would be a PDB for it but I get the reason why you don’t set up one.
