[AWS][Migration] Moving from NLB controller to ALB controller

a_carrano · August 12, 2024, 2:11pm

Only for customers using the Qovery Managed clusters on AWS

Dear customer,

We are excited to inform you that in the upcoming days, we will integrate the AWS Application Load Balancers (ALB) Controller into our product, adding more network control and features.

The rollout of this feature will require a migration on your cluster to remove the old Network Load Balancers controller (NLB), with a possible impact on your applications. Check the rest of the article for more information.

Context

At Qovery, we initially started with Kubernetes’ built-in Network Load Balancer controller (NLB). It was the best choice at the beginning of our company since it simplified a lot of things (if you are interested, we have described all the reasons in our blog post here).
Over the past weeks, we have been working to get rid of this legacy part and integrate the ALB controller.

We are not migrating from NLB to ALB: we will still be using NLB under the hood. What is changing is the Kubernetes controller that we use to manage the load balancers on AWS.

Benefits of Activating the ALB Controller

Reduced Downtime: The ALB controller helps decrease the downtime for some applications during updates.
Improved IP Forwarding: The original IP addresses are forwarded directly to your application, rather than the load balancer’s IP, providing enhanced transparency and traceability.
We will soon add other functionalities that are available only on applications using the ALB controller.

ALB controller, the default choice for new clusters

The ALB controller feature will be enabled by default for all new clusters, ensuring that you benefit from its advantages right from the start.

Migrating an existing application to ALB

We encourage you to activate this feature as soon as you can to take advantage of the benefits listed above.
Since the switch causes a short downtime (see sections below), we will let you decide when to apply this change.
Test the switch on a dev/staging cluster before applying this change on your production cluster.

If you take no action, we will force the migration to the ALB by the end of XXX ← UPDATE 21/10/2024: we won’t force the migration for now. We will send a separate communication to announce it, but we strongly encourage you to start the migration yourself.

If you have any questions or need assistance with the migration process, please do not hesitate to contact our support team or comment on this post.

Migration and Downtime

Activating the ALB controller involves a migration process with a maximum expected downtime of 10 minutes. This downtime is necessary because the current load balancer must be deleted and replaced as per AWS requirements. We strongly advise against enabling this advanced setting during your production hours to minimize any impact on your operations.

How to migrate

WARNING: as described above, downtime is expected during this migration.

Requirements for customers using custom VPCs (Qovery Managed VPC does not require these steps):

On public subnets: add a tag kubernetes.io/role/elb with the value 1 to the subnet where the ALB will be created.
On private subnets: add a tag kubernetes.io/role/internal-elb with the value 1 to the subnet where the ALB will be created.
On all subnets: add a tag kubernetes.io/cluster/<cluster-name> with the value shared to the subnet where the ALB will be created.
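
For anyone checking a custom VPC before the switch, here is a minimal Python sketch of the requirements listed above. It only computes which of the keys are still missing from a subnet; it does not call AWS. The function names, the cluster name my-cluster and the example subnet are hypothetical; only the keys and values come from this post.

```python
from __future__ import annotations


def required_subnet_tags(cluster_name: str, public: bool) -> dict[str, str]:
    """Keys/values a subnet needs, as listed in this post."""
    # Public subnets use the "elb" role key, private subnets the "internal-elb" one.
    role_key = "kubernetes.io/role/elb" if public else "kubernetes.io/role/internal-elb"
    return {
        role_key: "1",
        # Required on all subnets, public or private.
        f"kubernetes.io/cluster/{cluster_name}": "shared",
    }


def missing_subnet_tags(
    existing: dict[str, str], cluster_name: str, public: bool
) -> dict[str, str]:
    """Return the required keys that are absent or set to a different value."""
    required = required_subnet_tags(cluster_name, public)
    return {key: value for key, value in required.items() if existing.get(key) != value}


if __name__ == "__main__":
    # Hypothetical public subnet that already has the role key but not the cluster key.
    current = {"Name": "my-public-subnet", "kubernetes.io/role/elb": "1"}
    print(missing_subnet_tags(current, "my-cluster", public=True))
```

An empty result means the subnet already satisfies the requirements above. This sketch only accepts the exact values given in this post, so a key set to any other value is reported as missing.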

Through the advanced settings of your cluster, activate the ALB controller by changing the value of the advanced setting aws.eks.enable_alb_controller to true.
Once the value is updated, redeploy your cluster to apply the change.
Once the cluster redeploy is completed, redeploy any application exposing a TCP/UDP port and any publicly exposed container database. All your services exposed on an HTTP port are migrated automatically and no action is needed on your side.
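
To plan the manual redeploys from the last step, the rule above can be sketched in Python as follows. This is only an illustration: the Service class, its fields and the example services are hypothetical, and it does not talk to Qovery. The only names taken from this post are the setting aws.eks.enable_alb_controller and the rule itself (TCP/UDP applications and publicly exposed container databases need a redeploy; HTTP services do not).

```python
from dataclasses import dataclass

ALB_SETTING = "aws.eks.enable_alb_controller"


@dataclass
class Service:
    # Hypothetical description of a service, for illustration only.
    name: str
    protocol: str  # "HTTP", "TCP" or "UDP"
    is_container_db: bool = False
    publicly_exposed: bool = True


def enable_alb(settings: dict) -> dict:
    """Return a copy of the cluster advanced settings with the flag set to true."""
    updated = dict(settings)
    updated[ALB_SETTING] = True
    return updated


def needs_manual_redeploy(service: Service) -> bool:
    """Apply the rule from this post to a single service."""
    if service.is_container_db:
        # Container databases need a redeploy only when exposed publicly.
        return service.publicly_exposed
    # Applications on TCP/UDP need a redeploy; HTTP ones are migrated automatically.
    return service.protocol.upper() in {"TCP", "UDP"}


def services_to_redeploy(services: list) -> list:
    """Names of the services to redeploy once the cluster redeploy is completed."""
    return [s.name for s in services if needs_manual_redeploy(s)]


if __name__ == "__main__":
    services = [
        Service("api", "HTTP"),
        Service("game-server", "UDP"),
        Service("postgres", "TCP", is_container_db=True, publicly_exposed=True),
        Service("redis", "TCP", is_container_db=True, publicly_exposed=False),
    ]
    print(enable_alb({}))
    print(services_to_redeploy(services))
```

With the example services above, only game-server and postgres would need a manual redeploy.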

Note: if you have custom domains, you don’t have anything special to do; they will be automatically redirected to the new load balancer.

Thanks
Alessandro

jorgeramirezamora · September 12, 2024, 6:28pm

Hey @a_carrano,

Just tried the new aws.eks.enable_alb_controller advanced setting in our development cluster and it didn’t work. I rolled back to the default configuration, and even though our deployments are successful now, I am getting these errors in the services with custom domains configured:

Currently all our services running in our development cluster are not reachable. How can I bring them back?

Jorge

bchastanier · September 12, 2024, 6:55pm

Hey @jorgeramirezamora,

Sorry for the confusion here, today’s release has been postponed to Monday 09/16 (cf. Standard cluster update - 09/12/2024).

For the time being, this flag doesn’t trigger anything behind the scenes.
Can you link your service Qovery URL please so I can have a look?
The one you seem to have shared is this one, but it is stopped.

Cheers

jorgeramirezamora · September 12, 2024, 7:14pm

Hey @bchastanier,

Thanks for the update. The services that are not reachable are these two:

Service 1

Service 2

I assume the other services we have in that same cluster would also be unreachable if we turned them on (most of them are currently down).

Regards

Jorge

bchastanier · September 13, 2024, 8:12am

Hey @jorgeramirezamora,

I do see your domain validations are green now. Your services are stopped though; can you let me know if you still face the issue?

Cheers

jorgeramirezamora · September 13, 2024, 4:33pm

Hey @bchastanier,

I am sorry, our development services are down outside office hours. I forgot to update that deployment rule so you could review… It seems that they are working now. Not sure if this morning’s deployment fixed it or if it was just a matter of Route53 taking too long to update after the rollback.

Regards

Jorge

bchastanier · September 13, 2024, 4:45pm

Great to hear @jorgeramirezamora !
The ALB controller should be released this Monday; we will post an update once done.

Cheers

a_carrano · September 17, 2024, 8:59am

The flag has been released and you can now activate the ALB controller on your non-production clusters.

prki · October 4, 2024, 2:43pm

Is this now available to production clusters? When do you expect it to be?

a_carrano · October 7, 2024, 11:37am

Hi @prki ,

we are working on:

activating by default the ALB controller for new clusters
allowing you to enable the ALB controller on production clusters

We should deliver both points above in the next sprint (2 weeks).

prki · October 21, 2024, 6:59am

Hi @a_carrano, it’s been 2 weeks. Has this feature been released yet?

a_carrano · October 21, 2024, 8:21am

Hi @prki,

sorry for not giving updates on this.

We had a few delays in the delivery, so this has not been activated yet.

We should work on it next week, I will share an update here once we are ready.

prki · October 23, 2024, 8:58am

I see. We have a short maintenance window with downtime coming up and we would like to use it for the ALB migration. Otherwise, we may need to stay with NLB for a long time. This will not eventually become a forced migration, right?

a_carrano · October 24, 2024, 3:14pm

There will be a forced migration but I don’t think it will happen before 2025. I’ll try to share the details as soon as possible

prki · November 6, 2024, 8:45am

Hi @a_carrano, do you have any update on the migration?

a_carrano · November 6, 2024, 1:23pm

Hi, not yet, but it won’t happen before Q1 2025. We want to give all customers enough time to test it first and apply the change on their prod cluster.

prki · November 6, 2024, 1:29pm

Sorry, I wasn’t clear. I was asking whether we can already trigger the migration in production ourselves, not when Qovery will apply it to all customers.

a_carrano · November 8, 2024, 2:24pm

Hi @prki ,

you can now activate the ALB on production clusters! I’ll share an update here when we start forcing the update in Q1
