How does Kubernetes CPU and RAM resource allocation work with Qovery?

Hey guys! Coming here for a bit of help on how to better think about configuring our infrastructure with Qovery.

Basically, we have been having multiple issues with how to properly set up our resources. I'm not very sure how resource limitation works on Qovery, but I have taken a look at our AWS account: we have 13 EC2 instances running, while we currently have only 10 "apps" deployed with Qovery, of which 4 are databases managed by our service providers, so basically we only have 6 containerized apps running.

We know that of the 13 EC2 instances, at least 3 are unused and belong to the cluster we expect to become production; the other 10 are t3.medium instances from our only actively used cluster. It seems very strange to me that on each of them (with 2 vCPUs and 4 GB RAM) we can only run 1 app instance with 1 vCPU and 1 to 1.5 GB RAM, across an entire cluster currently running 10 t3.medium instances (which is also crazy given how low our current load is).

Our current configuration for our main app is:

  • 1 vCPU + 1 GB memory (on production) – 1 to 3 instances for scaling
  • 0.5 vCPU + 512 MB memory (on staging) – 1 to 2 instances for scaling

Each of these apps is in its own environment (but the same cluster), and each environment has 2 more apps, each running on 0.5 vCPU and 500 MB RAM with just 1 instance for scaling. We could bring those configurations down given how low their resource consumption is, but we still have a lot of available resources that are not being used at all.

If I understand correctly, the total available resources for our cluster, counting all the t3.medium instances running in the AWS account, are ~40 GB of memory and 20 vCPUs. That's why it seems absolutely crazy to me that the limit for my apps is just 1 app instance with 1 vCPU + 1 GB of memory.

How can we better calculate the resource distribution and stop having so much trouble with our apps' configuration?

Context
Since we started using Qovery last week on both production and staging, we have had apps freezing/halting several times due to resource misconfigurations. It has become a frustrating problem not being able to understand how resources are distributed and how we can best configure it all. We really like the vision of just being able to deploy our app stages and preview environments when needed, so we can focus on writing the app code without much headache. So I'm coming here for some help on how to better understand this.

Hi @ditorojuan, I acknowledge your message and will come back to you in a couple of hours with a clear and detailed explanation of how resource allocation works.

@ditorojuan,

Here is an explanation of how Qovery resource allocation works, and more precisely how Kubernetes resource allocation works. Because, as you noticed, Qovery is a control plane managing Kubernetes clusters, but CPU and RAM allocation relies on the Kubernetes scheduler. I think it's important to clarify this.

Kubernetes resource allocation

Kubernetes is a resource scheduler across a pool of nodes (a cluster). Basically, it looks at all the nodes within your cluster, their state, and how much CPU and RAM they have (this is a very simplistic view); then it looks at how much you requested for your app and assigns a node to your application. Your application will consume some resources from that node. Here are some rules that Kubernetes applies, to keep in mind:

  1. Your application's CPU and RAM can't be larger than the total CPU and RAM of a single node in your cluster.

In your example, you have t3.medium nodes with 2 vCPUs and 4 GB RAM each. It means that a single application can't request more than 2 vCPUs and 4 GB RAM.

  2. Actually, Kubernetes itself consumes some resources from your node, so you can't allocate the total CPU and RAM of your node. A rule of thumb is to subtract 0.1 CPU and 700 MB RAM from your t3.medium node; those values increase logarithmically with the size of the node.

In your example, it means that you can actually allocate at most 1.9 CPU and 3.3 GB RAM.

  3. In other words, you can pack a lot of applications into one node as long as you don't exceed its allocatable CPU and RAM.

In a t3.medium, you can fit either 1 app consuming 1.9 CPU and 3.3 GB RAM, or 10 apps consuming 0.19 CPU and 330 MB RAM each.
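To make the arithmetic concrete, here is a minimal Python sketch (illustrative only, not anything Qovery actually runs) that applies the same rule of thumb to a t3.medium:

```python
# A minimal sketch: how many identical app instances fit on one node, using the
# rough rule of thumb above (~0.1 vCPU and ~700 MB reserved for Kubernetes on a
# t3.medium). Values are in millicores (1 vCPU = 1000m) and MB so the arithmetic
# stays exact.

def allocatable(node_cpu_m, node_ram_mb, reserved_cpu_m=100, reserved_ram_mb=700):
    """CPU/RAM left for your apps once the system overhead is subtracted."""
    return node_cpu_m - reserved_cpu_m, node_ram_mb - reserved_ram_mb

def apps_per_node(node_cpu_m, node_ram_mb, app_cpu_m, app_ram_mb):
    """How many identical app instances the scheduler could pack onto one node."""
    cpu_left, ram_left = allocatable(node_cpu_m, node_ram_mb)
    return min(cpu_left // app_cpu_m, ram_left // app_ram_mb)

# t3.medium: 2 vCPU (2000m), 4 GB (~4000 MB)
print(allocatable(2000, 4000))                # -> (1900, 3300): 1.9 vCPU, 3.3 GB
print(apps_per_node(2000, 4000, 1000, 1000))  # 1 vCPU / 1 GB app  -> 1 per node
print(apps_per_node(2000, 4000, 190, 330))    # 0.19 vCPU / 330 MB -> 10 per node
```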

So in your case, it's normal to see one EC2 instance per application, since each of your apps takes up most of the allocatable resources of its instance, leaving no room for another one. Keep reading, because below I give some tips on how you can optimize this.

Autoscaling: How does Kubernetes scale-up work? :chart_with_upwards_trend:

You might have noticed that Qovery lets you choose how many nodes (min and max) you want for your cluster. This is what we call auto-scaling. The principle is simple: if your apps need more resources and you haven't reached the max, Kubernetes will spin up a new node to handle the additional workload (like a new app that needs to be deployed).

EKS does not ship the Kubernetes cluster autoscaler out of the box; however, we install it and make some optimizations. The scale-up principle is easy to understand, and here again it's based on your CPU and memory consumption and pressure (I am not going to detail this here, but I can if you want).
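As a rough mental model (a deliberate simplification, not the real cluster-autoscaler code), the scale-up decision looks something like this:

```python
# Simplified mental model of cluster scale-up: if a new workload can't fit on
# any existing node and the node count is below the max, add a node; otherwise
# the workload stays pending. Numbers reuse the t3.medium allocatable values above.

def fits(node, app):
    return node["cpu_m"] >= app["cpu_m"] and node["ram_mb"] >= app["ram_mb"]

def schedule(nodes, app, max_nodes, node_template):
    for node in nodes:
        if fits(node, app):
            node["cpu_m"] -= app["cpu_m"]
            node["ram_mb"] -= app["ram_mb"]
            return "scheduled on an existing node"
    if len(nodes) < max_nodes:
        nodes.append(dict(node_template))  # scale up: one more EC2 instance
        return "scaled up: " + schedule(nodes, app, max_nodes, node_template)
    return "pending: the cluster is at its node maximum"

t3_medium = {"cpu_m": 1900, "ram_mb": 3300}   # allocatable, not raw capacity
cluster = [dict(t3_medium)]
app = {"cpu_m": 1000, "ram_mb": 1000}         # a 1 vCPU / 1 GB app

print(schedule(cluster, app, max_nodes=3, node_template=t3_medium))  # fits node 1
print(schedule(cluster, app, max_nodes=3, node_template=t3_medium))  # adds a node
print(len(cluster))  # -> 2 nodes
```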

Autoscaling: How does Kubernetes scale-down work? :chart_with_downwards_trend:

This is where the default Kubernetes autoscaler sucks (sorry). As you noticed, you have EC2 instances running for nothing that are not rebalanced or dropped by Kubernetes. It is a big issue that we are aware of, and it has nothing to do with Qovery but with Kubernetes itself. This is why we are considering adding other autoscaling options like Karpenter to fix this over-consumption issue.

You can read more about this issue here.

How can you solve this issue?

@Pierre_Mavro do you have a temporary solution for this over-allocation issue on Kubernetes?

Now that I've explained how Kubernetes autoscaling works, I can respond to your question.

The best approach is to know exactly how much your application needs to work correctly and to set the appropriate CPU and RAM. For that, you need to measure it. Do you use an APM like Datadog or New Relic today? It's better to measure so you can estimate precisely. We have a tutorial explaining how to install Datadog on your EKS cluster with Qovery.
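For example (a hedged heuristic of mine, not an official Qovery recommendation), once your APM has collected a few days of measurements, you can derive a memory setting from a high percentile of observed usage plus some headroom:

```python
# Rough heuristic (an assumption, not an official Qovery rule): size the app's
# memory from a high percentile of measured usage, plus ~20% headroom.

def suggested_memory_mb(samples_mb, percentile=0.95, headroom=1.2):
    """samples_mb: memory usage measurements (MB) exported from your APM."""
    ordered = sorted(samples_mb)
    p95 = ordered[int(percentile * (len(ordered) - 1))]
    return round(p95 * headroom)

# e.g. measurements hovering around 300-420 MB with occasional spikes
samples = [310, 295, 330, 410, 305, 390, 420, 300, 315, 405]
print(suggested_memory_mb(samples))  # -> 492 MB, so 512 MB is a reasonable setting
```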

I am sorry about this; as mentioned above, I would highly recommend using an APM to better diagnose what happened to your app in terms of CPU and RAM consumption. Knowing what causes the resource starvation will help you adapt the amount of resources allocated to your app and get alerted if the issue happens again.

I know how frustrating it is, and trust me, we'll find solutions to make your life easier in future versions of Qovery. It's our obsession and full focus.

I love this response! Thank you very much for such a detailed answer. There's one quick question I need an answer to: is each environment on Qovery equivalent to a node on Kubernetes? In other words, is the calculation to be made around each environment vs. each application (an application being what's inside an environment in this context)?

A Qovery Environment is associated with a Kubernetes cluster (more precisely, with a namespace in that cluster); then, depending on how many applications you run in your environment and what they do, one or more nodes from this Kubernetes cluster will be involved.

Here is a quick schema showing how allocation can be spread across the nodes of your cluster.

In cases where too many resources are wasted across too many nodes, using bigger nodes is a simple way to avoid this issue.

Wow, this image made it click for me. Now that I understand how things work, I have been tweaking the resources and consumption without causing the app to halt or fail subsequent deployments.

Basically, the pods had a huge resource misconfiguration: we were limiting them to less than 50% of each node's resources, assigning roughly 20% of the available RAM and 50% of the vCPU of the available nodes. With that distribution we were running 1 pod per node, but each pod had very low resources. So Kubernetes was running a bunch of nodes, and whenever they saw a spike in RAM they triggered a restart and re-scaling; but the pods also kept restarting because our users kept performing the same action, so the cluster didn't know what to do with those nodes. Having hit the cluster's maximum node count, this eventually led to a 504 on our frontend, so the production application was down.

PS: I'll come back to this to make it a better response, as it might work as documentation for everyone <3

Hey, since this is the thread to explain it all, here is a little case study about resource consumption and autoscaling.

Here is the tool used to get all the values.

If we are able to fit an application on one node or less, which is better for auto-scaling: allocating additional nodes or more resources per node?
e.g. is allocating 5 t3.large instances better than having 3 t3.xlarge instances for application performance, or vice versa?

Thanks!

Hello,
are you talking about pod autoscaling or node autoscaling?

If you're talking about pod autoscaling, what is the bottleneck? Network? CPU? RAM?

Hey @Enzo
Talking about node autoscaling. We are contemplating running our nodes as either 3 t3.xlarge nodes or 5 t3.large nodes, and it would be great if you could share your opinion on whether spreading across more EC2 nodes is better than consolidating on larger nodes.

For us, the bottleneck is primarily memory. We do have periods of heavy use, so we need to accommodate this by setting a higher memory max on our pods, but that prompts the cluster to spin up additional EC2 instances.

Thanks!!!

So if I understand correctly, you manually increase the memory limit of the app when needed?

If yes, you should definitely use the pod autoscaler: if your app struggles, a new pod is created. In that case you have two options regarding nodes (see the quick comparison sketch after this list):

  1. Use smaller nodes that fit your needs exactly. The good side is the cost; the bad side is that when a pod is created to relieve the memory pressure on the other(s), you'll have to wait for a new node to be created.
  2. Use bigger nodes which are oversized. The good side is no delay on pod creation; the bad side is the cost, since you have unconsumed resources.
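To put rough numbers on the two shapes discussed in this thread (using the standard AWS specs, t3.large = 2 vCPU / 8 GB and t3.xlarge = 4 vCPU / 16 GB; per-node Kubernetes overhead is ignored to keep the sketch simple):

```python
# Quick comparison of the two node-pool shapes discussed above.
# AWS specs: t3.large = 2 vCPU / 8 GB RAM, t3.xlarge = 4 vCPU / 16 GB RAM.
# Per-node Kubernetes overhead is ignored here to keep the sketch simple.

def pool(count, cpu_per_node, ram_gb_per_node):
    return {
        "total_cpu": count * cpu_per_node,
        "total_ram_gb": count * ram_gb_per_node,
        # the biggest single pod you can run is still capped by one node:
        "max_pod_cpu": cpu_per_node,
        "max_pod_ram_gb": ram_gb_per_node,
    }

print(pool(5, 2, 8))   # 5 x t3.large  -> 10 vCPU / 40 GB total, pod ceiling 2 vCPU / 8 GB
print(pool(3, 4, 16))  # 3 x t3.xlarge -> 12 vCPU / 48 GB total, pod ceiling 4 vCPU / 16 GB
```

Since memory is your bottleneck, note that a single pod can never request more than one node's RAM, so the xlarge shape raises that per-pod ceiling; the smaller nodes follow the lower-cost but slower-scale-up trade-off of option 1.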

Hey @Enzo, can you share any info on how I can set up the pod autoscaler? I use Terraform to deploy our apps.

Thank you very much for your response!!!

Hey @sama213,
As shown in this example, you can set min_running_instances and max_running_instances.

If your app gets CPU pressure, it will scale up until it reaches the max_running_instances value, and scale back down when things get better.

Hey @Enzo, just to confirm I read the documentation in this thread correctly: memory pressure will also cause a scale-up, and then pods will be scaled down when things are back to normal? Also, what is the threshold that causes a scale-up?

@sama213 pod autoscaling is only triggered by CPU pressure. The Qovery default threshold is 60%, but you can edit it through the application's advanced settings.
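For reference, this follows the standard Kubernetes HPA behaviour: desired replicas = ceil(current replicas x current CPU utilization / target utilization), clamped to your min/max running instances. A tiny sketch with the 60% default and made-up utilization numbers:

```python
import math

# Standard Kubernetes HPA formula, clamped to the Qovery min/max running
# instances. The utilization numbers below are made up for illustration.

def desired_replicas(current, cpu_utilization_pct, target_pct=60, minimum=1, maximum=3):
    desired = math.ceil(current * cpu_utilization_pct / target_pct)
    return max(minimum, min(maximum, desired))

print(desired_replicas(1, 90))  # 90% CPU on 1 pod -> 2 pods
print(desired_replicas(2, 95))  # still hot        -> 3 pods (capped by the max)
print(desired_replicas(3, 20))  # load drops       -> 1 pod (scales back down)
```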

Hello @Enzo,
I am seeing some inconsistencies between the resource numbers shown in the Qovery console for our application and the actual resources we are allocating.

How can CPU usage be marked at 53.0 while the actual allocated resources are 0.25 CPU?
(please see screenshot below).

I also ran the kubectl-view-allocations command against one of our clusters. Looking at the resources for each application, we have only allocated around 1.5 CPU units in total for applications, plus a deployed Datadog agent, so why is it showing that no CPU is available? We have 3 t3.large nodes, so we should have at least 2 CPU units available if my calculations are correct.

Link to our environment in the Qovery console below:


Hi @sama213, can you open another thread? This one was about Kubernetes resource allocation. Thank you :pray: