We’re finding that we’re spending a lot of time waiting on preview environments to deploy.
Would anyone here have some general tips on how they’ve sped up preview environment deployments? For example, is anyone getting preview environments with multiple applications, built from GitHub repos, deployed in under 5 minutes? How did you do it?
How many services do you have per environment? 11 services
What technology/language/framework are you using? NodeJS (based on Node16 Docker image), NestJS, Nginx
Which database(s)? 2 x Postgres db (as containers, not managed)
Do you use a CI/CD system? Which one? GitHub Actions (but they run in parallel - preview environments spin up straight away on a new commit, with no waiting on the GitHub Actions to finish)
Do you deploy exclusively from Docker or also from a container registry? We use the Git provider as the application source for all services.
I published this article explaining the improvements we will release in the coming weeks to speed up deployment. The most significant one is parallel deployment, which will save you a lot of time.
Qovery’s Deployment Pipeline feature allows for parallel deployment: by executing multiple stages at the same time, you can significantly speed up the deployment process and reduce the risk of downtime during deployment.
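To illustrate the general idea (this is not Qovery’s actual mechanism - `deploy_service` is a hypothetical stand-in for whatever triggers a single service deployment): deploying independent services concurrently bounds the total time by the slowest service rather than by the sum of all of them.

```sh
# Hypothetical stand-in for whatever triggers a single service deployment
deploy_service() { echo "deploying $1"; sleep 2; }

# Sequential: total time is the sum of every service's deployment time
for svc in api worker gateway; do deploy_service "$svc"; done

# Parallel: total time is bounded by the slowest single service
for svc in api worker gateway; do deploy_service "$svc" & done
wait
```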
Improvement on your side
Now, even once parallel deployment is available on Qovery and drastically speeds up your deployment time, you still need to look at your app build times. Some builds may take much longer than others and could be improved.
Can you list your 11 services with the build time for each?
Can you provide the Dockerfiles for the ones that account for most of the total build time?
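To illustrate the kind of improvement we typically look for: make sure dependency installation is cached as its own layer, and ship only the build output in the final image. Here is a minimal multi-stage Dockerfile sketch for a NestJS service on the Node 16 image, assuming a standard npm project whose `build` script outputs to `dist/` - treat the paths and scripts as assumptions to adapt.

```dockerfile
# Build stage: the dependency layer is cached as long as the manifests don't change
FROM node:16-alpine AS build
WORKDIR /app

# Copy the manifests first so `npm ci` only re-runs when dependencies change
COPY package.json package-lock.json ./
RUN npm ci

# Copy the sources and compile (assumes a standard NestJS `build` script)
COPY . .
RUN npm run build

# Drop devDependencies so the runtime image only carries production deps
RUN npm prune --production

# Runtime stage: ship only the compiled output and production dependencies
FROM node:16-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
CMD ["node", "dist/main.js"]
```

With this layout, a commit that only touches source code reuses the cached `npm ci` layer, which is usually where most of the build time goes; a `.dockerignore` excluding `node_modules` and build artifacts also keeps the `COPY . .` layer small.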
@rophilogene I’m trying to run the tests to get deployment times, but deploying the Postgres container database services now takes so long that it keeps timing out, which raises an error and blocks the full deployment. Here are some example logs showing the issue: Qovery
Anything we can configure differently to make sure the database initialisation does not time out and block preview environments from being deployed? Thanks!
It seems a node in your cluster was unhealthy, making it impossible to deploy a database.
The node has been removed, so your deployment should work now.
You would need extra monitoring to catch that kind of node failure yourself.
But we haven’t done anything on our side; AWS took care of removing the node.
I just looked at the events and re-triggered the deployment of your environment.
@will, as mentioned by @Erebe, I highly encourage you to use a monitoring solution like Datadog, New Relic, or others. Otherwise, you will be blind if something goes wrong on your cluster.
I just want to double-check that you didn’t take any extra steps - was it really just re-triggering the failed deployment? We are seeing the same error again today, and I’ve been re-triggering deployments all morning, but the error persists. Here are deployment logs running right now: Qovery
Here are the deployment logs for a separate environment that just failed (same error - a timeout when creating a database container service): Qovery
This time there were two issues on your cluster.
The first is that you hit the ‘RulesPerSecurityGroupLimitExceeded’ quota, so we were not able to provision a new external IP for your public database.
For this, the solutions are either increasing your AWS quota (Increase security group rule quota in Amazon VPC | AWS re:Post) or reducing your use of public databases to avoid hitting the quota.
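For reference, you can check and request that quota increase from the AWS CLI too. A sketch, assuming the relevant quota is VPC’s “Inbound or outbound rules per security group” (usually quota code L-0EA8095F - verify it in your account with the first command):

```sh
# Find the security-group-rules quota and its code in your account
aws service-quotas list-service-quotas --service-code vpc \
  --query "Quotas[?contains(QuotaName, 'rules per security group')].[QuotaName,QuotaCode,Value]" \
  --output table

# Request an increase (adjust the desired value to your needs)
aws service-quotas request-service-quota-increase \
  --service-code vpc \
  --quota-code L-0EA8095F \
  --desired-value 120
```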
The second issue is that AWS had trouble provisioning a new node for your cluster in the same region where your network volumes are provisioned. There is no solution for this besides retrying from time to time; it usually happens because AWS lacks machine capacity for your instance type in the given region.
We gave up trying to use Preview Environments - we could not even complete a benchmark deployment performance test, because every deployment hit new errors that required redeploying individual services and then the whole environment (which then used cached build images and existing nodes, so it was not a proper benchmark). It all took too much time for the benefit it provided.
Instead, we are going to set up several semi-permanent test environments linked to Git branches.