What's the recommended way to scale down a cluster?

Hey there! :wave:

I’m currently looking at our AWS costs as they seem very high to me for our workload – happy to give more details in private. On the EKS side, our nodes seem underutilized, at least in terms of usage (~10% CPU / RAM on average).

So I tried to scale down the cluster:

  • I lowered the “Desired” state of the Auto Scaling Group on AWS → It ended up being modified back (by Qovery, I guess?) a few minutes later, with two more nodes than originally, hehe…
  • I lowered the range in the Qovery console and triggered an update → The range and desired state in AWS are still higher (max is 2x higher).

Something like 15 minutes after these attempts, the cluster actually scaled down a bit, but I’m not sure why. :slight_smile:

Hence my question: what’s the recommended way to scale down a cluster?

(Side question: is there any way to check node resource allocation?)

Edit: Looks like the current autoscaling group range is now the same as on Qovery, so I guess it just took 15 minutes to update on AWS?


Hello!

I’ll try to answer step by step.

First, Qovery is the main control plane: every modification done on the cloud provider side will be overridden by Qovery. It was fine this time, but be careful, a wrong manipulation on the provider side could break your cluster or its link with the Qovery control plane.

I lowered the “Desired” state of the Auto Scaling Group on AWS → It ended up being modified back (by Qovery, I guess?) a few minutes later, with two more nodes than originally, hehe…

That’s because the cluster autoscaler handles the desired size by itself, based on resource consumption.
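If you want to see what the autoscaler itself is deciding, you can look at its status and logs. A minimal sketch, assuming a standard cluster-autoscaler deployment in kube-system (names and labels may differ on a Qovery-managed cluster):

```
# Status ConfigMap written by cluster-autoscaler (name is the upstream default,
# it may differ on a Qovery-managed cluster)
kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml

# Recent scale-up / scale-down decisions; the label selector is an assumption,
# adjust it to whatever "kubectl -n kube-system get deploy" shows on your cluster
kubectl -n kube-system logs -l app.kubernetes.io/name=aws-cluster-autoscaler --tail=100
```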

I lowered the range in the Qovery console and triggered an update → The range and desired state in AWS are still higher (max is 2x higher).

Edit: Looks like the current autoscaling group range is now the same as on Qovery, so I guess it just took 15 minutes to update on AWS?

When you modify the node pool, it takes around 20 minutes to become effective, which could explain the difference between the AWS UI and the Qovery UI.

Something like 15 minutes after these attempts, the cluster actually scaled down a bit, but I’m not sure why.

When you change the pool settings and a scale down is triggered, all pods on the node to be deleted must be migrated to other nodes. Once that’s done, the node can be deleted properly. This operation takes time.

Hence my question: what’s the recommended way to scale down a cluster?

The best way is to let the autoscaler do the job. For your information, the scale margin is 10% CPU: if a node uses 90%+ CPU, a new node will be created. If the pods running on a node can fit on another one while keeping it under 90% CPU usage, it will scale down. This resource consumption is checked every minute and is based on the actual workload.
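For reference, in the stock cluster-autoscaler this behaviour is driven by a handful of flags. The values below are the upstream defaults, shown only as a sketch; they are not necessarily what Qovery configures:

```
# Flags on the cluster-autoscaler container (upstream defaults, reference only)
--scan-interval=10s                     # how often the cluster state is re-evaluated
--scale-down-utilization-threshold=0.5  # a node becomes a scale-down candidate below this ratio
--scale-down-unneeded-time=10m          # how long a node must stay unneeded before removal
--scale-down-delay-after-add=10m        # cool-down period after a scale-up
```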

(Side question: is there any way to check node resource allocation?)

A feature coming in v3 will display all the information you’ll need. In the meantime, if you’re using k9s, switch to the node view with :no, then press ctrl+w if you don’t see the resource columns. If you want more detail, press enter on a node and use the same shortcuts to display the pods’ resource usage. If you’re using kubectl and want a quick overview, kubectl top nodes is what you need. For a more detailed output, use kubectl describe nodes. Again, be careful: a wrong manipulation could break it all.
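To put the kubectl side together, a minimal sketch (the grep is just a convenience to jump to the relevant section of the describe output):

```
# Live usage per node (requires metrics-server to be installed)
kubectl top nodes

# Requests / limits actually reserved on each node ("Allocated resources" section)
kubectl describe nodes | grep -A 10 "Allocated resources"
```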

Hi @ramnes, I would add: maybe Qovery AWS EC2 would be a better and more cost-effective solution for you? (We don’t recommend it for production workloads since it’s not resilient, but if EKS cost is an issue, it could be a good option.)

It could be great for me as well to test this :smiley:


TL;DR: I still don’t know why my cluster didn’t scale down by itself earlier, but at least I do know now that it can’t scale down more than it did yesterday because all nodes have a high CPU allocation.

That’s the whole point: I tried to scale down manually because nodes were using an average of 10% CPU and RAM. I wouldn’t have touched anything if the cluster had stayed at the minimum number of nodes as long as they were under 90% CPU.

I assumed there were reasons on your side to keep more nodes alive, hence the post. If not, the real question is why my cluster doesn’t stay at the minimum number of nodes when its CPU usage is 10%. :slight_smile:

Both k9s and kubectl top nodes show usage, not allocation. What I wanted to see is how much resource was still allocatable on my nodes.

I asked the question because:

  1. I saw resource allocation as another potential reason for my cluster not scaling down, and
  2. from what I could remember, this information wasn’t available through kubectl;

but I’ve found in the meantime that kubectl describe node actually gives it!

And from its output, CPU requests / limits are very high: around 90% of the allocatable CPU is requested on most nodes. So that explains why my cluster doesn’t scale down more, but not why it didn’t scale down earlier, before I tried manually.

FWIW, Qovery’s overall Kubernetes stack allocates between 12% and 61% of the available CPU on each one of my nodes, with an average of 34%, so there’s probably a bit of optimization left on your side. (For example, pods in the qovery namespace seem to stay idle most of the time; maybe they don’t need 200m each?)

But I guess I should first and foremost switch to bigger EC2 machines (to reduce the impact of Kubernetes’ overhead) and reduce CPU allocation on my applications a bit more, but that’s hard without CPU / RAM usage metrics over time. Any plan for a Vertical Pod Autoscaler feature within Qovery? :slight_smile:
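(Side note: the upstream Vertical Pod Autoscaler can already run in a recommendation-only mode; a minimal sketch, assuming the VPA components are installed on the cluster and using a placeholder deployment name:)

```
# Requires the VPA recommender to be installed first (not there by default)
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa            # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # placeholder: your application's Deployment
  updatePolicy:
    updateMode: "Off"         # recommendation only, never evicts or resizes pods
EOF

# After it has watched the workload for a while, read the CPU / RAM recommendations
kubectl describe vpa my-app-vpa
```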


We’re full of AWS credits so I’m not trying to heavily cut costs; I just want to avoid blowing these credits unnecessarily. And I want the resilience. :slight_smile:


Hey there!

We just updated our cluster to switch to bigger instances. The move was super smooth, thanks and congrats! :tada:

That being said, my goal was to lower the number of instances since we now have twice as much CPU and RAM. But right now the cluster is stuck at 4 instances. I’ve checked the requests and limits, and everything would fit quite easily on 3 instances.

So I’m coming back to my very first question in this thread. :sweat_smile: Is there any way to force the scale down?

Hello,
what I can already say is that part of the resources are allocated to the applications that manage the cluster (logs, auto-scaling, networking, metrics scraping…), and those allocations are already optimized as much as we can.

The thing is, allocation isn’t usage. If we reduce the allocation of an app that consumes less than its allocated values to match its average usage, it will lead to issues when the app comes under resource pressure: autoscaling could fail, certificates couldn’t be generated, or DNS records couldn’t be created.

To answer your question: you can, but it’s not recommended since Kubernetes handles it for you through the autoscaler. It checks whether, if a node were deleted, all apps could fit on the remaining nodes. If yes, the node to be deleted is drained, then deleted.

If you still want to do it on your own, this answer could help.
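Roughly, the manual path is to cordon and drain the node yourself before removing it. A sketch only (the node name is a placeholder, and doing this on a Qovery-managed cluster is at your own risk):

```
# Mark the node unschedulable so no new pods land on it
kubectl cordon <node-name>

# Evict the workloads (DaemonSet pods stay; emptyDir data is lost)
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# Then terminate the instance via the Auto Scaling Group / node group,
# otherwise pods will simply be rescheduled back once the node is uncordoned
```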

I know the difference between allocation and usage, and I know that you’ve already worked on reducing the overhead. :slight_smile:

My point is that the current setup seems too gentle. I didn’t look at all the configurations but the scheduler doesn’t seem to binpack, there doesn’t seem to be any rebalancing mechanism, and cluster-autoscaler doesn’t seem very eager either.

Our current cluster looks like this, and still, no node gets cordoned and removed:

Node 1

  • CPU requests: 57%
  • RAM requests: 32%

Node 2

  • CPU requests: 61%
  • RAM requests: 25%

Node 3

  • CPU requests: 47%
  • RAM requests: 11%

Node 4

  • CPU requests: 95%
  • RAM requests: 97%

Allocated resources are used for the scaling decision, not requested ones.

If we take a look at your cluster and allocated resources:

  • Node 1: 99% CPU, 42% RAM
  • Node 2: 78% CPU, 26% RAM
  • Node 3: 41% CPU, 21% RAM
  • Node 4: 92% CPU, 98% RAM

That’s why it doesn’t scale down.

I thought cluster-autoscaler was only considering the requests (which AFAIK is true for the utilization threshold that triggers a scale down) but indeed, the target nodes do need to have enough allocatable space or the pods can’t be moved, thanks for the reminder. :slight_smile:

What’s the rationale for keeping high limits on loki, cert-manager-cainjector, qovery-cluster-agent and qovery-shell-agent? With the limits of these four pods alone, 3.5 CPUs are locked, meaning that we’re effectively paying for a t3.xlarge just for them when they actually use almost no resources 99% of the time. Do they really need that much on peak usage? It seems odd compared to the 200m limit of nginx, for example. Genuinely asking, no bad feelings!
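(A quick way to list their requests and limits across namespaces, sketch only; the -A and the grep are just because I don’t know off-hand which namespaces they live in:)

```
# CPU requests / limits for every pod, filtered to the ones mentioned above
kubectl get pods -A \
  -o custom-columns='NS:.metadata.namespace,POD:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,CPU_LIM:.spec.containers[*].resources.limits.cpu' \
  | grep -E '^NS|loki|cainjector|qovery-(cluster|shell)-agent'
```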

We chose those values because they ensure those apps will run smoothly in most cases. If one of them freezes, it will cause a service interruption.