Build Docker Image on GitLab: the best approach to build docker-in-docker, how to use the cache to improve performance, and how to push the images to AWS ECR.

Build Docker Image on Gitlab [without dind and with AWS ECR]

Building a Docker image on GitLab is an easy task, but it can get complicated in complex scenarios, and it takes a few tricks to use the cache correctly. In this article, I will show you how to generate a Docker image in your pipeline on GitLab.

GitLab is a well-known DevOps platform delivered as a single application. That is what makes GitLab unique and produces a smooth software workflow, freeing your business from the limitations of a pieced-together toolchain.

GitLab gives us a remarkable ability to scale our build process by using Runners. A Runner can be any computer running a daemon service that connects to GitLab, which means it could be a virtual machine or a Docker container.

What is the goal of the Runner?

To perform all the tasks that the user triggers on GitLab, while GitLab itself only orchestrates the Runners. You can also have more than one Runner, and Runners can serve more than one GitLab instance.

The most significant advantage of a Runner is that we can use Docker to build our application. This is fantastic because it allows us to run our builds quickly without installing the dependencies directly on the Runner server.

To register a Runner on GitLab, we need to specify which Docker image will be the default one to run our builds inside.

You may be wondering why I am explaining this. Well, when you registered the Runner, you probably specified a generic Docker image like Alpine, without any extra applications, as described in the official documentation.

That means that, at some point, we will need to create our own images with the dependencies required to build one specific application. And we don't want to manually create this image outside GitLab and then use it inside GitLab to build our application.

When should we create our own Docker Image to use on Gitlab?

It's up to you, or rather the application's requirements. For example, a simple Java application that uses Maven doesn't require your own image. Instead, you can go to Docker Hub, grab the official Maven image, and use it for your project. See this Java pipeline on Gitlab as an example.
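For instance, a job like this minimal sketch is often all a plain Maven project needs (the job name, image tag, and Maven goals are illustrative, not prescribed):

build-java:
  image: maven:3.8-openjdk-11   # official Maven image from Docker Hub
  script:
    - mvn --batch-mode package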

But what if our project needs more than one dependency? For example, if the Java project also has some TypeScript files that need to be compiled to JavaScript, we need to install and use npm to compile them.

Here we have two alternatives: 

First, we can split our pipeline into two stages: one compiles the Java code, and a second stage compiles the TypeScript code using a different Docker image that is already available on Docker Hub (a sketch of this follows below).

The second alternative: you can configure your Maven project (pom.xml) to build the TypeScript code during the build phase. (This is possible using the frontend-maven-plugin.)
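Here is what the first alternative could look like, as a minimal sketch where each stage uses a ready-made Docker Hub image (job names, image tags, and commands are illustrative):

stages:
  - build-java
  - build-frontend

compile-java:
  image: maven:3.8-openjdk-11   # official Maven image
  stage: build-java
  script:
    - mvn --batch-mode compile

compile-typescript:
  image: node:16                # official Node.js image, provides npm
  stage: build-frontend
  script:
    - npm ci
    - npx tsc                   # compile the TypeScript sources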

But what if your application has unique requirements, with special dependencies and specific configurations that need to change dynamically for each build?

This is the scenario where you will need to build your own image. The good news is that we can create a project, or a stage inside your existing project, to make this Docker image, and later use that new image in the next stage. You may be thinking that if we build the Docker image every time before processing the next stage, it will be a slow process.

However, this is not necessarily true. The first time the stage runs, it will definitely take time, but once the image has been built, it stays in the local Docker image store on the Runner server. So the next time you trigger the same pipeline, you will reuse the same image, and the stage that builds it will finish very quickly. We also don't need to worry about where to store the image, because it is created on the fly if it doesn't exist yet.
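You can see this cache directly on the Runner host, assuming you have shell access to it:

# On the Runner host: list the locally stored images that later pipelines reuse
docker images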

Also, there are scenarios where our project's goal is to create a Docker image that doesn't require code compilation, for example, to make our own Tomcat image that we will later deploy on a Kubernetes cluster. In this case, we need to push the image to a remote Docker registry, for example, Docker Hub or ECR (Elastic Container Registry) from AWS.

Simple example:

variables:
  REGISTRY: "XXXXXXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/tomcat"
  # Branch name plus job ID gives every build a unique image tag
  FINAL_NAME_TOMCAT: ${REGISTRY}/api:tomcat-$CI_COMMIT_REF_NAME-$CI_JOB_ID

build-tomcat:
  image: docker   # official Docker CLI image
  stage: build
  script:
    - docker build -t $FINAL_NAME_TOMCAT .

Build Docker Image on Gitlab: Resolving Common Issues

The example above works very well and is very straightforward. However, there is a common issue you may face, signaled by an error message like this:

"Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"

Why does this error happen? How to resolve it?

When we install Docker on a computer or server, we actually get two pieces: the client (the docker CLI) and the daemon service. They don't need to run on the same machine; technically, we can use the client alone and point it at a daemon running elsewhere.

For that, we need a remote host where the daemon service is running; every operation you execute with the Docker client then runs remotely on that server. It's important to highlight that this is not the out-of-the-box behavior: after installation, the daemon only listens on a local socket, and remote access must be enabled manually. Once enabled, the daemon opens port 2375 (plain TCP) or 2376 (TLS) by default, and you can use the server's IP and that port from another computer that also has Docker installed.
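For reference, exposing that TCP endpoint usually means starting the daemon with extra -H listeners, something like the sketch below (in production you would use 2376 with TLS rather than plain 2375):

# Minimal sketch: make the daemon listen on the local socket AND on plain TCP
dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375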

How to specify a remote Docker host? And then use it?

Docker provides the DOCKER_HOST environment variable, where we can specify the host and port. You can export this variable like this:

export DOCKER_HOST=tcp://server.www.bitslovers.com:2376

After defining that environment variable, the Docker client will use the remote daemon service from that host/server.
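A quick way to confirm the client is now talking to the remote daemon:

docker info   # the reported hostname and resources are now the remote server's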

But, wait.

Let's remember that, behind the scenes, the build process happens inside the Runner server, and on that server we already have a Docker daemon running.

In this scenario, it would be better to have another approach for specifying our Docker daemon without explicitly defining it in every .gitlab-ci.yml file.

And the good news is that we have a better approach!

Build Docker Image on Gitlab: Using Docker Sock

Docker also provides a second alternative, where we can use a file (a Unix socket) to communicate with the daemon service. If you have the daemon service running, by default Docker creates the file /var/run/docker.sock.

So, to use that file, we need to specify DOCKER_HOST in our .gitlab-ci.yml file, like this:

variables:
  DOCKER_HOST: "unix:///var/run/docker.sock"

But if you try to run it again, you will get the same error message:

"Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"

or 

error during connect: Post "http://docker:2375/v1.24/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&shmsize=0&target=&ulimits=null&version=1": dial tcp: lookup docker on 10.100.0.6:53: no such host

This has not resolved our problem. Why?

Let me explain:

We are building a Docker image inside another Docker container, and the socket file lives on the Runner server. So the file is not reachable from inside the container where the build process effectively runs.

So, to resolve that issue, we can share that file between the Runner and the Container.
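Conceptually, this is the same trick you can do by hand with any container: mount the host's socket into it, and the Docker CLI inside the container talks to the host's daemon:

docker run --rm -v /var/run/docker.sock:/var/run/docker.sock docker:latest docker info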

We need to register the Runner again with the socket shared as a volume. I recommend unregistering the Runner you already have, to avoid your builds picking up the wrong configuration.

gitlab-runner register --registration-token YOUR_TOKEN --name BITSLOVERS_RUNNER --url https://gitlab.example.com --non-interactive --executor docker --docker-image alpine --docker-volumes '/var/run/docker.sock:/var/run/docker.sock'

Replace the values according to your configuration:

 1 – YOUR_TOKEN

 2 – gitlab.example.com

This approach is the best because we no longer need to specify DOCKER_HOST in any pipeline/project on GitLab. Because we declared the Docker volume when we registered the Runner, the socket is automatically mounted and shared with every Docker container the Runner starts. Also, the default value of DOCKER_HOST is unix:///var/run/docker.sock, so we can safely remove it from our script.
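To sanity-check the new Runner, a throwaway job like this sketch should now work without any DOCKER_HOST definition (the job name is illustrative):

check-docker:
  image: docker
  script:
    - docker info   # succeeds only if the Runner mounted /var/run/docker.sock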

This approach also speeds up our process because we use the cache efficiently. When we build a Docker image inside another container this way, the image is built by, and stored in, the Docker daemon on the Runner server itself. So the next time we run the same pipeline, it will hit that cache and finish faster. And even though a fresh container is created for each pipeline to run the build, the cache stays the same.

[Diagram: building the Docker image on GitLab using the cache on the Runner server, the best approach to docker-in-docker]

In this example, since we are building docker-in-docker through the shared socket, we are only building the image; we are not pushing it to a remote Docker registry.

To push the image to any Docker registry, we need to log in before pushing. The login process is different for each registry; for example, logging in to Docker Hub is different from logging in to AWS ECR.

Note: If you are running your Runner using AWS Fargate, there is no way to use cache for your Docker Images if you always create a new Runner for each build process.

Pushing Image to AWS ECR

Let's see how we can log in and push images to AWS ECR.

You always need to run the command below before executing "docker push":

$(aws ecr get-login --no-include-email --region us-east-1)

Just make sure that you have the right AWS IAM Role configured on the EC2 instance (The Runner). Alternatively, you can use Access Key and Secret Key, but AWS does not recommend it due to security concerns.

Also, the command above generates a login token whose session is valid for 12 hours. So always execute this command before the push to guarantee that the session is still valid.
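Note that "aws ecr get-login" only exists in AWS CLI v1 and was removed in v2. If your Runner image ships CLI v2, the equivalent (which also appears in a later example in this article) is:

aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin XXXXXXXXXXXX.dkr.ecr.us-east-1.amazonaws.com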

So, in our example:

variables:
  REGISTRY: "XXXXXXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/tomcat"
  FINAL_NAME_TOMCAT: ${REGISTRY}/api:tomcat-$CI_COMMIT_REF_NAME-$CI_JOB_ID

build-tomcat:
  # Note: the job image must provide both the docker CLI and the aws CLI
  image: docker
  stage: build
  script:
    - docker build -t $FINAL_NAME_TOMCAT .
    # Log in to ECR right before pushing, so the token is always fresh
    - $(aws ecr get-login --no-include-email --region us-east-1)
    - docker push $FINAL_NAME_TOMCAT
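One caveat: the push fails with a "repository does not exist" error if the ECR repository was never created. Creating it once up front avoids that (the repository name here is assumed to match the example):

aws ecr create-repository --repository-name tomcat --region us-east-1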

Sometimes we need to build a Docker image in order to process a second task/stage, for example, preparing a Docker image that will compile an application. In this scenario, we can build the image and use it in the next stage without pushing it to a remote Docker registry.

Tip: Always use GitLab Variables [see our complete guide] in your code to improve maintainability.

For instance, let’s see how to do it:

variables:
  AWS_REGION: us-east-1
  STACK_NAME: bitslovers-service

stages:
  - prep
  - deploy

Prep:
  image: docker:latest
  stage: prep
  script:
    # Build the helper image; it stays in the Runner's local image store
    - docker build -t build-container .

Deploy:
  image:
    name: build-container:latest   # the image built in the previous stage
  stage: deploy
  script:
    - pip install awscli
    - pip install aws-sam-cli
    - aws configure set region ${AWS_REGION}
    - sam deploy --template-file template.yml --stack-name $STACK_NAME

In this example, we build a Docker image named "build-container" in the "prep" stage. That image contains some dependencies that we need in the next phase (the deploy stage) of the build process.

So we don't need to worry about storing that image; we build it on demand and use it. The build-container:latest image is kept in the local Docker image store on the Runner, so the next time you execute that build, it will be faster because the image already exists. (Note that for the Docker executor to use a locally built image like this, the Runner's pull policy must allow local images, for example pull_policy = "if-not-present" in config.toml; the default "always" would try to pull it from Docker Hub and fail.)

Also, there is a second alternative to build the Docker image on GitLab: we can use the "docker:dind" image to build the images. However, this approach doesn't use the cache efficiently.

Let’s see one example:

variables:
  DOCKER_REGISTRY: XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com
  AWS_DEFAULT_REGION: us-east-1
  REPOSITORY_NAME: bitslovers-app
  DOCKER_HOST: tcp://docker:2375
  # Recent docker:dind images enable TLS by default (port 2376); unset the
  # cert dir so the daemon listens on plain 2375 as configured above
  DOCKER_TLS_CERTDIR: ""

publish:
  image:
    name: amazon/aws-cli
    entrypoint: [""]
  services:
    - docker:dind
  before_script:
    # The aws-cli image has no Docker client, so install one
    - amazon-linux-extras install docker
    - aws --version
    - docker --version
  script:
    - docker build -t $DOCKER_REGISTRY/$REPOSITORY_NAME:$CI_PIPELINE_IID .
    - aws ecr get-login-password | docker login --username AWS --password-stdin $DOCKER_REGISTRY
    - docker push $DOCKER_REGISTRY/$REPOSITORY_NAME:$CI_PIPELINE_IID

One advantage of this approach is that we don't need to manage disk space on the server as frequently, because the images aren't stored in the Runner server's local image store; after the pipeline finishes, the image and its cache are deleted.
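The flip side of the socket approach is that images do accumulate on the Runner host over time. A periodic cleanup, for example from cron, is a common mitigation (the retention window here is just an example):

# Remove unused images, containers, and build cache older than 7 days
docker system prune -af --filter "until=168h"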

Conclusion

Building a Docker image on GitLab CI is an easy task, but it takes a few tricks to get the most out of all the features. Personally, I prefer the socket file over the dind service (from "docker:dind"), because dind is slow to start and we lose the image layer cache, since a fresh Docker daemon with empty storage is created for every single interaction with the pipeline.

How to build a Javascript project on Gitlab.

How to execute Cloud Formation from Gitlab.

How to deploy Elastic Beanstalk using CI/CD.

Enjoy!
