[BUG] Cannot delete a cluster that has no environment

Hello,

We are trying to delete our staging cluster. It shows the following error:

Screenshot 2024-06-11 at 15.49.49

But we don’t have any environment running except Production.

URL: https://console.qovery.com/organization/7b2c7fcd-6cc9-4c61-8647-8bf81e4fcda9/clusters/general

Note: as we can’t delete the cluster, we are currently paying for an unused cluster plus all the AWS resources like RDS :smiling_face_with_tear:

Hello @Francois,

I’m taking a look

You can delete your cluster now; it should be good.
An environment, “staging-new”, was still attached to the cluster, but it had been marked as deleted (every service inside it had been deleted). Because the environment was marked as “DELETED”, we no longer display it in the console.
This is an issue we should resolve in the coming days.

Thanks!

Now the cluster deletion is broken & stuck… :sweat_smile:
It has been 1h30 since I started it

Screenshot 2024-06-13 at 12.17.58

Logs say: Infrastructure 'Staging cluster AWS-DEV (zace6887d)' deletion is in progress...

Hello @Francois,

On deletion, we delete every namespace inside your cluster.
Looking at it, there seem to be some custom components installed whose resources are waiting on finalizers, e.g. here it is stuck with argocd:

Thanks for the clear explanations, will check it out :slight_smile:

Solved, and cluster deleted :slight_smile: :tada:
Thanks @Melvin_Zottola :pray:

For anyone reading this: on cluster deletion, the controllers were removed from the namespace before they had time to clear the finalizers on their own resources, which caused this “stuck loop” ^^
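
To make the finalizer problem concrete, here is a minimal sketch (not what Qovery runs, just an illustration) that takes the JSON printed by something like `kubectl get <kind> -n <namespace> -o json` and lists the objects that are marked for deletion but still hold finalizers. The function name `stuck_on_finalizers` and the sample object `my-app` are made up for the example; `resources-finalizer.argocd.argoproj.io` is the finalizer Argo CD puts on Applications.

```python
import json


def stuck_on_finalizers(kubectl_list_json: str) -> list[tuple[str, list[str]]]:
    """Return (name, finalizers) for objects that are being deleted
    but are still held back by at least one finalizer."""
    doc = json.loads(kubectl_list_json)
    stuck = []
    for item in doc.get("items", []):
        meta = item.get("metadata", {})
        finalizers = meta.get("finalizers") or []
        # deletionTimestamp is set once a delete was requested; the object
        # only disappears after every finalizer has been removed.
        if meta.get("deletionTimestamp") and finalizers:
            stuck.append((meta.get("name", ""), list(finalizers)))
    return stuck


# Hypothetical sample data, shaped like a `kubectl get ... -o json` list.
SAMPLE = json.dumps({
    "kind": "List",
    "items": [
        {
            "kind": "Application",
            "metadata": {
                "name": "my-app",
                "deletionTimestamp": "2024-06-13T10:00:00Z",
                "finalizers": ["resources-finalizer.argocd.argoproj.io"],
            },
        },
        {
            # Not being deleted, so it is not reported.
            "kind": "Application",
            "metadata": {"name": "other-app", "finalizers": []},
        },
    ],
})

if __name__ == "__main__":
    for name, finalizers in stuck_on_finalizers(SAMPLE):
        print(f"{name} is waiting on: {', '.join(finalizers)}")
```

Anything this reports is waiting on a controller that may no longer be running, which matches the situation described above.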

For Argo CD, the controllers got deleted before the Application custom resources.
For KEDA, it was a bit more complicated, as external.metrics.k8s.io/v1beta1 was missing.

More info about it here: The Hidden Dangers of Terminating Namespaces
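
For the KEDA case, a rough sketch of how one might spot the problem, assuming it was an aggregated API that was still registered but no longer served (the thread only says it "was missing", so this is a guess at the mechanism). It reads the JSON from `kubectl get apiservices -o json` and returns the APIServices whose `Available` condition is not `True`. The function name `unavailable_apiservices` and the sample data are hypothetical.

```python
import json


def unavailable_apiservices(apiservices_json: str) -> list[str]:
    """Return the names of APIServices that do not report Available=True."""
    doc = json.loads(apiservices_json)
    broken = []
    for item in doc.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        available = any(
            c.get("type") == "Available" and c.get("status") == "True"
            for c in conditions
        )
        if not available:
            broken.append(item["metadata"]["name"])
    return broken


# Hypothetical sample data, shaped like `kubectl get apiservices -o json`.
SAMPLE_APISERVICES = json.dumps({
    "items": [
        {
            "metadata": {"name": "v1.apps"},
            "status": {"conditions": [{"type": "Available", "status": "True"}]},
        },
        {
            "metadata": {"name": "v1beta1.external.metrics.k8s.io"},
            "status": {"conditions": [{"type": "Available", "status": "False"}]},
        },
    ]
})

if __name__ == "__main__":
    for name in unavailable_apiservices(SAMPLE_APISERVICES):
        print(f"APIService not available: {name}")
```

An unavailable aggregated API can keep a namespace in Terminating because the API server cannot enumerate every resource type it needs to clean up.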

Thanks very much for the feedback @Francois :pray:

I maybe ~talked~ wrote a bit too fast :sweat_smile:

There were only a few pods left in 1 namespace (kube-system), so I assumed the delete button would just clean up the empty cluster and clicking it was enough:

Screenshot 2024-06-13 at 14.42.50

But it turned out it is now installing more pods (aws-node, kube-proxy, etc.)
And CoreDNS is failing to start: MountVolume.SetUp failed for volume "config-volume" : configmap "coredns" not found.

(Not sure why it tries to install more things while trying to delete? Maybe it does a last apply before the destroy?)

Indeed, the “delete” action does a re-install followed by a destroy.
So you should see the cluster being destroyed once everything has been created properly on the Kubernetes side.

From your cluster delete logs, the terraform destroy is running so it is now in the destroy phase :crossed_fingers:

Hmm, it still seems stuck :frowning:

No nodes are being created to run coredns