We set up a one-off cron job based on Cronjob | Docs | Qovery. It's a long-running process, and after force running it, it shows up with a "Deploying" status on the Deployments tab. We also have automatic deployments set up on PR merges, but they don't trigger a deployment now because the long-running cron process is still "deploying".
Is this the intended functionality or are we missing something?
Also, you can see in the "Environments" view that staging is "Cancelling".
Your understanding is correct: in the case of a cron job, we deploy it, mark it as OK, and it is then executed on schedule as a CronJob by Kubernetes.
BUT if you force run this cron job, it is executed directly as a Job, and we wait for it to finish or fail.
In your case, this job has a max duration of 259,200 seconds (72 hours), meaning we will wait for it to finish or fail within that timeout. During this time, its environment is considered to be in the DEPLOYING state, and no other services can be deployed at the same time.
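To illustrate the waiting behavior described above (a sketch only, not Qovery's actual implementation), this is roughly what blocking on a Job's completion with a timeout looks like at the Kubernetes level. The job name `archive-run` and namespace `my-env` are made-up placeholders:

```shell
# Block until the Job completes, or give up after the 72h timeout.
# Requires access to a live cluster; names are hypothetical.
kubectl wait --for=condition=complete job/archive-run \
  --namespace my-env --timeout=259200s
```

While a process is blocked on a wait like this, nothing else can safely redeploy the same environment, which is why the environment stays in DEPLOYING.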
From the last deployment, it also seems your job is failing. Can you confirm whether there is an error in its logs?
The force run on a cronjob is supposed to launch just one execution of the job, without interfering with the cronjob's scheduling.
From a technical point of view, when you trigger the force run, an engine instance launches the deployment of a job (based on the cronjob code) on your cluster and monitors it for the whole execution time. This is the main reason why we can't let your cronjob "force run" run for too long.
Maybe, from a technical point of view, we could just trigger another job based on the cronjob in a fire-and-forget mode (i.e. without having our engine instance wait for it to complete), but I'm not sure we would still be able to retrieve the final execution status.
Kubernetes has a way to start a manual run from an already deployed cronjob, maybe it would make sense to use that for force runs.
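For reference, the Kubernetes mechanism being referred to is kubectl's ability to create a one-off Job from an already deployed CronJob. A minimal sketch, with placeholder names (`my-cronjob`, `manual-run-1`, `my-env`):

```shell
# Create a single one-off Job from the existing CronJob's template.
# This does not touch the CronJob's schedule; names are hypothetical.
kubectl create job manual-run-1 \
  --from=cronjob/my-cronjob \
  --namespace my-env
```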
In general, what we are looking for is one-off, long-running jobs for an archiving/data migration use case. We tried to deploy it as a cron job and use force run when we needed to run it, but that got our deployments stuck. Then we switched to updating the job schedule to something like five minutes from now, which works, but it's not a great UX.
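The rescheduling workaround can be done through the job settings; at the Kubernetes level, the equivalent would be patching the CronJob's schedule. A hedged sketch, assuming it is currently 14:00 UTC and using placeholder names:

```shell
# Reschedule the CronJob to fire at 14:05 (standard cron syntax).
# It will keep firing daily at that time unless rescheduled again.
kubectl patch cronjob my-cronjob --namespace my-env \
  --type merge -p '{"spec":{"schedule":"5 14 * * *"}}'
```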
Yes, with the --from=cronjob/<cronjob_name> option, but then there are no safety measures to prevent deploying hundreds of jobs and piling them up on your infra, with no easy way to stop or kill them (unless you connect to the cluster). Anyway, I'm adding this change to our backlog.
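As noted, cleaning up jobs that have piled up currently requires connecting to the cluster directly. A sketch of what that looks like, with placeholder names:

```shell
# Inspect the Jobs in the environment's namespace...
kubectl get jobs --namespace my-env

# ...and delete a specific one (this also terminates its pods).
kubectl delete job manual-run-1 --namespace my-env
```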