[AWS][Migration] Moving from NLB controller to ALB controller

Only for customers using the Qovery Managed clusters on AWS

Dear customer,

We are excited to inform you that in the upcoming days, we will integrate the AWS Application Load Balancer (ALB) controller into our product, adding more network control and features.

:warning: :warning: The rollout of this feature will require a migration on your cluster to remove the old Network Load Balancer (NLB) controller, with a possible impact on your applications. Check the rest of the article for more information. :warning: :warning:

Context

At Qovery, we initially started with Kubernetes’ built-in Network Load Balancer controller (NLB). It was the best choice at the beginning of our company since it simplified a lot of things (If you are interested, we have described all the reasons in our blog post here).
Over the past weeks, we have been working to get rid of this legacy part and integrate the ALB controller.

:information_source: We are not migrating from NLB to ALB; we will still be using NLB under the hood. What is changing is the Kubernetes controller that we use to manage the load balancers on AWS.

Benefits of Activating the ALB Controller

  • Reduced Downtime: The ALB controller helps decrease the downtime for some applications during updates.
  • Improved IP Forwarding: The original IP addresses are forwarded directly to your application, rather than the load balancer’s IP, providing enhanced transparency and traceability.
  • We will soon add other functionalities that are available only on applications using the ALB controller.

ALB controller, the default choice for new clusters

The ALB controller feature will be enabled by default for all new clusters, ensuring that you benefit from its advantages right from the start.

Migrating an existing application to ALB

We encourage you to activate this feature as soon as you can to take advantage of the benefits listed above.
Since the switch causes a short downtime (see sections below), we will let you decide when to apply this change.
Test the switch on a dev/staging cluster before applying this change on your production cluster.

If you take no action, we will force the migration to the ALB by the end of XXX ← UPDATE 21/10/2024: we won’t force the migration for now. We will send a separate communication to announce it, but we strongly encourage you to start the migration yourself.

If you have any questions or need assistance with the migration process, please do not hesitate to contact our support team or comment on this post.

Migration and Downtime

Activating the ALB controller involves a migration process with a maximum expected downtime of 10 minutes. This downtime is necessary because the current load balancer must be deleted and replaced as per AWS requirements. We strongly advise against enabling this advanced setting during your production hours to minimize any impact on your operations.

How to migrate

:warning: WARNING: as described above, downtime is expected during this migration :warning:

  1. Requirements for customers using custom VPCs (Qovery Managed VPC does not require these steps):
  • On public subnets: add a tag kubernetes.io/role/elb with the value 1 to the subnet where the load balancer will be created.
  • On private subnets: add a tag kubernetes.io/role/internal-elb with the value 1 to the subnet where the load balancer will be created.
  • On all subnets: add a tag kubernetes.io/cluster/<cluster-name> with the value shared to the subnet where the load balancer will be created.
  2. In the advanced settings of your cluster, activate the ALB controller by setting aws.eks.enable_alb_controller to true.

  3. Once the value is updated, redeploy your cluster to apply the change.

  4. Once the cluster redeploy is completed, redeploy any application exposing a TCP/UDP port and any container database exposed publicly. Services exposed on an HTTP port are migrated automatically, and no action is needed on your side.

Note: if you have custom domains, you don’t have anything special to do; they will be automatically redirected to the new load balancer.
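
For the custom VPC subnet requirement in the steps above, here is a minimal Python sketch of how the key/value pairs could be assembled and applied. The helper names `required_subnet_tags` and `tag_subnet`, and the cluster name and subnet ID in the usage note, are hypothetical and not part of Qovery. The second helper assumes boto3 is installed and AWS credentials are configured; treat it as an illustration, not an official procedure.

```python
from typing import Dict, List


def required_subnet_tags(cluster_name: str, public: bool) -> List[Dict[str, str]]:
    """Build the key/value pairs listed above for one subnet.

    The result uses the Key/Value shape expected by the EC2 tagging API.
    """
    # Public subnets get the "elb" role, private subnets the "internal-elb" role.
    role_key = "kubernetes.io/role/elb" if public else "kubernetes.io/role/internal-elb"
    return [
        {"Key": role_key, "Value": "1"},
        # Required on all subnets, public or private.
        {"Key": f"kubernetes.io/cluster/{cluster_name}", "Value": "shared"},
    ]


def tag_subnet(subnet_id: str, cluster_name: str, public: bool) -> None:
    """Apply the pairs to one subnet (hypothetical helper)."""
    # Assumption: boto3 is installed and credentials/region are configured.
    # Imported lazily so required_subnet_tags works without boto3.
    import boto3

    ec2 = boto3.client("ec2")
    ec2.create_tags(
        Resources=[subnet_id],
        Tags=required_subnet_tags(cluster_name, public),
    )
```

For example, `tag_subnet("subnet-0123456789abcdef0", "my-cluster", public=True)` would apply the public-subnet pairs to a placeholder subnet. Printing the result of `required_subnet_tags` first is a cheap way to review what will be written before touching your VPC.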

Thanks
Alessandro

Hey @a_carrano,

Just tried the new aws.eks.enable_alb_controller advanced setting in our development cluster and it didn’t work. I rolled back to the default configuration, and even though our deployments are successful now, I am getting these errors in the services with custom domains configured:

Currently, none of the services running in our development cluster are reachable. How can I bring them back?

Jorge

Hey @jorgeramirezamora,

Sorry for the confusion here, today’s release has been postponed to Monday 09/16 (cf. Standard cluster update - 09/12/2024).

For the time being, this flag doesn’t trigger anything behind the scenes.
Can you share your service’s Qovery URL, please, so I can have a look?
The one you seem to have shared is this one, but it is stopped.

Cheers

Hey @bchastanier,

Thanks for the update. The services that are not reachable are these two:

Service 1

Service 2

I assume the other services we have in that same cluster would also be unreachable if we turned them on (most of them are currently down).

Regards

Jorge

Hey @jorgeramirezamora,

I do see your domain validations are green now, but your services are stopped. Can you let me know if you still face the issue?

Cheers

Hey @bchastanier,

I am sorry, our development services are down outside office hours. I forgot to update that deployment rule so you could review… It seems that they are working now. Not sure if this morning’s deployment fixed it or if it was just a matter of Route53 taking too long to update after rolling back the ALB setting.

Regards

Jorge

Great to hear @jorgeramirezamora !
The ALB controller should be released this Monday; we will post an update once it is done.

Cheers

The flag has been released and you can now activate the ALB controller on your non-production clusters.

Is this now available to production clusters? When do you expect it to be?

Hi @prki ,

We are working on:

  1. activating the ALB controller by default for new clusters
  2. allowing you to enable the ALB controller on production clusters

We should deliver both points above in the next sprint (2 weeks).

Hi @a_carrano, it’s been 2 weeks, was this feature released yet?

Hi @prki,

Sorry for not giving updates on this.

We had a few delays in the delivery, so this has not been activated yet.

We should work on it next week. I will share an update here once we are ready.

I see. We have a short window of opportunity for maintenance with downtime, and we would like to do the ALB migration then. Otherwise, we may need to stay with NLB for a long time. This will not become a forced migration eventually, right?

There will be a forced migration but I don’t think it will happen before 2025. I’ll try to share the details as soon as possible

Hi @a_carrano, do you have any update on the migration?

Hi, not yet, but it won’t happen before Q1 2025. We want to give all customers enough time to test it first and then apply the change on their prod clusters.

Sorry, I wasn’t clear. I was asking whether we can trigger the migration in production ourselves, not when Qovery will apply it to all customers.

Hi @prki ,

You can now activate the ALB controller on production clusters! I’ll share an update here when we start forcing the update in Q1.