The Major Problem I’ve Faced When Using AWS ECS to Deploy Microservices

I’ve been using ECS to deploy my microservices for many years. But this service is not ready for real-world projects.

With ECS, I can configure the autoscaling of my microservices, deploy a new version of each microservice, monitor the health checks, and more.

The problem comes when a single microservice becomes unstable, meaning it’s unable to start correctly. In this case, ECS can’t stabilize the situation.

When this occurs, ECS tries to start the new version. But as it’s unstable, the health check returns an error. Then ECS stops the failing task and starts a new one with the same version.

This seems like the obvious workflow. ECS keeps doing this until I push a corrected version.

But here comes the problem.

I can’t update a service which isn’t stable.

I run the following command to update the service with the corrected version:

aws ecs update-service \
    --cluster my-cluster \
    --service my-microservice \
    --desired-count 2 \
    --task-definition my-microservice-task:2 \
    --deployment-configuration "{\"maximumPercent\": 200, \"minimumHealthyPercent\": 100}" \
    --force-new-deployment

This returns an error, because the service I want to update is not stable.

But that’s exactly why I want to update the service: it isn’t stable.

How to proceed?

I must first make the service stable. To do that, I edit the service settings so that the desired count is 0 and no instances are running.

aws ecs update-service \
    --cluster my-cluster \
    --service my-microservice \
    --desired-count 0 \
    --task-definition my-microservice-task:2 \
    --deployment-configuration "{\"maximumPercent\": 200, \"minimumHealthyPercent\": 100}" \
    --force-new-deployment

After that, ECS will slowly deregister the running instances of the service my-microservice.

When I have 0 instances, ECS considers my service stable. This may take up to 5 minutes.

And then, I can update again with the new version.

aws ecs update-service \
    --cluster my-cluster \
    --service my-microservice \
    --desired-count 2 \
    --task-definition my-microservice-task:2 \
    --deployment-configuration "{\"maximumPercent\": 200, \"minimumHealthyPercent\": 100}" \
    --force-new-deployment

Now AWS accepts my command, and ECS deploys the new version, which is stable.
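The three steps above can be chained into one script. What follows is only a minimal sketch: the function name redeploy_unstable_service and its arguments are hypothetical, and it assumes that the `aws ecs wait services-stable` waiter reports the service as stable once it has scaled down to 0. It still has to be launched by hand, so it only saves the waiting and the retyping.

```shell
#!/usr/bin/env bash
# Sketch: chain the three manual steps (scale to 0, wait, redeploy).
# The function name and argument order are hypothetical.

redeploy_unstable_service() {
    local cluster="$1" service="$2" task_definition="$3" desired_count="$4"

    # Step 1: scale the unstable service down to 0 instances.
    aws ecs update-service \
        --cluster "$cluster" \
        --service "$service" \
        --desired-count 0 > /dev/null || return 1

    # Step 2: block until ECS reports the service as stable.
    # The waiter polls the service and fails if it times out.
    aws ecs wait services-stable \
        --cluster "$cluster" \
        --services "$service" || return 1

    # Step 3: deploy the corrected task definition at the wanted count.
    aws ecs update-service \
        --cluster "$cluster" \
        --service "$service" \
        --desired-count "$desired_count" \
        --task-definition "$task_definition" \
        --deployment-configuration '{"maximumPercent": 200, "minimumHealthyPercent": 100}' \
        --force-new-deployment > /dev/null
}

# Example call, with the names used in this article:
# redeploy_unstable_service my-cluster my-microservice my-microservice-task:2 2
```

If any step fails, the function stops and returns a non-zero status, so the corrected version is never pushed onto a service that did not settle first.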

How to know if a service is stable or not?

In the AWS console, go to the ECS service screen.

[Screenshot: AWS ECS Listing Clusters]

Then, select the cluster and go to the service in question.

[Screenshot: AWS ECS Listing Services]

Finally, go to the Deployments and Events tab.

Now, in the Events section, if the service is stable, the latest message should be “service xxx has reached a steady state.”

[Screenshot: AWS ECS Deployments and Events tab]

If this is not the case, it may still be deploying and will soon be stable.

Or it’s starting and stopping the service in a loop, which means that the problem is still present.
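The same check can probably be done from the command line instead of the console. The sketch below is an assumption-laden example, not an official recipe: the helper names is_steady_state and latest_event are hypothetical, and it assumes that `aws ecs describe-services` lists the service events newest first, so that `events[0]` is the latest message.

```shell
#!/usr/bin/env bash
# Sketch: read the newest service event from the CLI instead of the console.
# Both helper names are hypothetical.

# Succeeds when an event message is the steady-state message.
is_steady_state() {
    case "$1" in
        *"has reached a steady state."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the newest event message of a service.
# Assumption: describe-services returns events newest first.
latest_event() {
    aws ecs describe-services \
        --cluster "$1" \
        --services "$2" \
        --query 'services[0].events[0].message' \
        --output text
}

# Example call, with the names used in this article:
# if is_steady_state "$(latest_event my-cluster my-microservice)"; then
#     echo "stable"
# else
#     echo "not stable yet"
# fi
```

A single "not stable yet" answer doesn't say whether the service is still deploying or stuck in a loop, so the check has to be repeated over a few minutes to tell the two cases apart.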

I haven’t found a way to deploy a corrected version automatically.

So, for now, I run those commands manually.

If someone knows how to do it automatically, I would appreciate a comment with the solution.

2 responses to “The Major Problem I’ve Faced When Using AWS ECS to Deploy Microservices”

  1. I too have experienced this on Fargate a few times where, let’s say, an unstable build has been pushed out and I have either had to set the desired count to 0 or stop all tasks and push a new build out quickly.

    Don’t believe I ever got to the bottom of it apart from making sure the dev teams built more stable services and put better monitoring in place to alert when tasks are continuously cycling.

    1. Monitoring AWS is an option, but I refuse to put in place a system to monitor a deployment system. I should put all the effort into the application, not into the deployment system.
      Hope we find a better solution soon.
