[BUG] Cannot delete a cluster that has no environment

Francois · June 11, 2024, 3:04pm

Hello,

We are trying to delete our staging cluster. It shows the following error:

Screenshot 2024-06-11 at 15.49.49

But we don’t have any environment running except Production.

URL: https://console.qovery.com/organization/7b2c7fcd-6cc9-4c61-8647-8bf81e4fcda9/clusters/general

Francois · June 12, 2024, 3:10pm

Note: since we can’t delete the cluster, we are currently paying for an unused cluster plus all the associated AWS resources, such as RDS.

Melvin_Zottola · June 12, 2024, 3:44pm

Hello @Francois,

I’m taking a look

Melvin_Zottola · June 12, 2024, 3:54pm

You can delete your cluster now, it should be good.
An environment named “staging-new” was still present, but it had been marked as deleted (every service inside it had been deleted). Because the environment was marked as “DELETED”, we no longer display it in the console.
This is an issue we should resolve in the coming days.

Francois · June 13, 2024, 10:18am

Thanks!

Now the cluster deletion is broken & stuck…
It has been 1h30 since I started it

Screenshot 2024-06-13 at 12.17.58

Logs say: Infrastructure 'Staging cluster AWS-DEV (zace6887d)' deletion is in progress...

Melvin_Zottola · June 13, 2024, 10:26am

Hello @Francois,

On deletion, we delete every namespace inside your cluster.
Looking at it, there seem to be some custom components installed that are waiting on finalizers; for example, it is currently stuck on argocd.

Francois · June 13, 2024, 10:30am

Thanks for the clear explanations, will check it out

Francois · June 13, 2024, 12:33pm

Solved, and cluster deleted
Thanks @Melvin_Zottola

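For anyone reading this: on cluster deletion, the namespaces were emptied of their controllers before those controllers had time to remove the finalizers from the resources they manage. With no controller left to clear the finalizers, the resources could never be deleted, which caused this “stuck loop” ^^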
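
To make the stuck state easier to spot, here is a minimal Python sketch that scans Kubernetes objects (as plain dicts, for instance parsed from `kubectl get <resource> -n <namespace> -o json`) and lists those that are marked for deletion but still carry finalizers. The function name and the sample data are hypothetical; only `metadata.deletionTimestamp` and `metadata.finalizers` are standard Kubernetes fields.

```python
from typing import Any


def find_stuck_objects(items: list[dict[str, Any]]) -> list[tuple[str, str, list[str]]]:
    """Return (kind, name, finalizers) for objects pending deletion that still have finalizers."""
    stuck = []
    for obj in items:
        meta = obj.get("metadata", {})
        finalizers = meta.get("finalizers") or []
        # deletionTimestamp is set once deletion has been requested; the object
        # is only removed when its finalizers list becomes empty.
        if meta.get("deletionTimestamp") and finalizers:
            stuck.append((obj.get("kind", "?"), meta.get("name", "?"), list(finalizers)))
    return stuck


# Hypothetical sample, shaped like the "items" list of a `kubectl get ... -o json` output.
sample_items = [
    {
        "kind": "Application",
        "metadata": {
            "name": "my-app",  # illustrative name, not from the thread
            "deletionTimestamp": "2024-06-13T10:00:00Z",
            "finalizers": ["resources-finalizer.argocd.argoproj.io"],
        },
    },
    {
        # Not marked for deletion, so it is not reported.
        "kind": "ConfigMap",
        "metadata": {"name": "settings"},
    },
]

if __name__ == "__main__":
    for kind, name, finalizers in find_stuck_objects(sample_items):
        print(f"{kind}/{name} is waiting on finalizers: {', '.join(finalizers)}")
```

If the controller that owns a finalizer is already gone, nothing will clear it. The options are then to bring the controller back so it can finish its cleanup, or to remove the finalizer by hand, which skips that cleanup.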

For Argocd, controllers got deleted before the Application Custom resources.
For KEDA, it was a bit more complicated as external.metrics.k8s.io/v1beta1 was missing.
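
For the KEDA case, one possible check is to look for aggregated APIs that are registered but no longer served. This is only a sketch, and it assumes that “missing” above means the `v1beta1.external.metrics.k8s.io` APIService was still registered while its backend was gone. The input is assumed to be the `items` list from `kubectl get apiservices -o json`; the function name and the sample data are illustrative.

```python
from typing import Any


def unavailable_api_services(items: list[dict[str, Any]]) -> list[str]:
    """Return the names of APIService objects without an Available=True condition."""
    names = []
    for svc in items:
        conditions = svc.get("status", {}).get("conditions", [])
        available = any(
            c.get("type") == "Available" and c.get("status") == "True"
            for c in conditions
        )
        if not available:
            names.append(svc["metadata"]["name"])
    return names


# Hypothetical sample data; the "reason" value is illustrative.
sample_api_services = [
    {
        "metadata": {"name": "v1.apps"},
        "status": {"conditions": [{"type": "Available", "status": "True"}]},
    },
    {
        "metadata": {"name": "v1beta1.external.metrics.k8s.io"},
        "status": {
            "conditions": [
                {"type": "Available", "status": "False", "reason": "ServiceNotFound"}
            ]
        },
    },
]

if __name__ == "__main__":
    for name in unavailable_api_services(sample_api_services):
        print(f"APIService {name} is not available")
```

An unavailable aggregated API can keep a namespace in the Terminating state, because the namespace controller cannot enumerate the resources of that API group to confirm they are gone.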

More info about it here: The Hidden Dangers of Terminating Namespaces

Melvin_Zottola · June 13, 2024, 12:35pm

Thanks very much for the feedback @Francois

Francois · June 13, 2024, 12:45pm

I may have ~talked~ written a bit too fast

There were only a few pods left in 1 namespace (kube-system), so I assumed the delete button would just clean up the empty cluster and clicking it was enough:

Screenshot 2024-06-13 at 14.42.50

But it turns out it is now installing more pods (aws-node, kube-proxy, etc.)
And CoreDNS is failing to start: MountVolume.SetUp failed for volume "config-volume" : configmap "coredns" not found.

(Not sure why it tries to install more things while trying to delete? Maybe it does a last apply before the destroy?)

Melvin_Zottola · June 13, 2024, 12:55pm

Indeed the “delete” action does a re-install + destroy.
So you should see the cluster being destroyed once everything has been created properly on the Kubernetes side.

Melvin_Zottola · June 13, 2024, 12:57pm

From your cluster delete logs, the terraform destroy is running so it is now in the destroy phase

Francois · June 13, 2024, 3:34pm

Hmm, it still seems stuck

No nodes are being created to run coredns

Francois · June 17, 2024, 2:48pm

We also had some resources of our own installed in the Qovery VPC on our AWS account (a Redis instance), and they were blocking the cluster deletion.
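
For anyone hitting the same thing, a rough way to find what is still attached to a VPC is to list its remaining network interfaces. The Python sketch below filters a response shaped like the output of the EC2 `DescribeNetworkInterfaces` call (for example from `aws ec2 describe-network-interfaces --output json`). The function name, the IDs, and the descriptions are made up for illustration; this is not how Qovery itself checks for leftovers.

```python
from typing import Any


def leftover_interfaces(response: dict[str, Any], vpc_id: str) -> list[tuple[str, str]]:
    """Return (interface id, description) for every network interface in the given VPC."""
    return [
        (eni["NetworkInterfaceId"], eni.get("Description", ""))
        for eni in response.get("NetworkInterfaces", [])
        if eni.get("VpcId") == vpc_id
    ]


# Hypothetical response; all IDs and descriptions are invented.
sample_response = {
    "NetworkInterfaces": [
        {
            "NetworkInterfaceId": "eni-0aaa",
            "VpcId": "vpc-staging",
            "Description": "redis cache node",
        },
        {
            "NetworkInterfaceId": "eni-0bbb",
            "VpcId": "vpc-production",
            "Description": "load balancer",
        },
    ]
}

if __name__ == "__main__":
    for eni_id, description in leftover_interfaces(sample_response, "vpc-staging"):
        print(f"{eni_id}: {description}")
```

Any interface still listed for the VPC points at a resource that has to be removed before the VPC, and therefore the cluster, can be destroyed.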

Cluster is now deleted

Closing the thread

system · June 24, 2024, 2:48pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
