Traefik

Web server fitting nicely with docker

Posted by Christoph Eckerle, Philipp Höfflin, Drilon Konjufca, Rainer Michel on March 26, 2018 in Dev tagged with Docker, Devops, Development
  1. Introduction
  2. Basics
    1. Source and Features
    2. Central Concepts
  3. Traefik for a multi-dev environment
    1. Use Case
    2. General Network Setup
    3. Configuration
  4. Traefik as a gateway for data integration pipelines
    1. Use Case
    2. Security
    3. Configuration hot swapping
    4. Load Balancing
    5. Configuration
  5. Downsides

Introduction

You may have experienced this already, you want to implement a solution that’s supported by an external tool and at the first glance it seems easy. As you begin to adapt the tool to your requirements its true nature is revealed: Some tools carry high adoption costs, some are surprisingly easy to use and other are good fun to work with. In this blog post we talk about our experience of implementing the open source http reverse proxy ‘Traefik’ - which in this case was just good fun!

Now you may ask, what makes Traefik so special? Firstly, Traefik fits nicely in dockerized environments, because it runs natively on Docker. Being centralized means it is easy to expose any http service, add basic authentication and handle SSL. Furthermore due to hot swapping of services no downtime is needed for configuration changes. This offers a great advantage over other popular reverse proxies such as Nginx. Traefik is a good fit for dynamic and service orientated environments.

By reading this article you will learn the basic concepts of Traefik. You will gain experience of how to setup a network to bridge between docker containers and frontends in a docker swarm cluster. Additionally you will get hints on how you can secure your environment, configure Traefik for hot swapping and load balancing within the use case of Traefik as a gateway for data pipelines. Nothing in IT is ever perfect and at the end of this blog we share some of the downsides we experienced with Traefik.

Basics

Source and Features

Traefik is a natively dockerized reverse proxy and load balancer to shield backend services from frontends by acting as a single point of entry. You can get Traefik from github. It is written in Go and deployed as a single binary. Traefik supports circuit breakers, round robin patterns, websocket and http/s, access logs e.g. in JSON-format, integration to metric-apps like Prometheus and the open certificate authority ‘let`s encrypt’. Traefik has an AngularJS Web UI to view the current state of health, http errors and routing rule sets during runtime. Other natively supported backends include Swarm, Kuberenetes, Mesos, Consul and AWS ECS, just to name a few of the more popular ones. Additionally there is a general REST API.

Central Concepts

Once you have the Traefik binary from github, you can start the Docker image via docker run. The next step is to configure Traefik by editing a docker-compose.yml, which you will see in more details in the following sections. But before we talk about how we used Traefik for our specific use cases, let`s introduce the central concepts of Traefik:

  • Entrypoints are the network entrypoints for Traefik by using a port, SSL certificates, keys or redirections to other http/s entrypoints. The following sample shows a https entrypoint exposed on port 443 and additional TLS configuration with linked certificate and key files:
          [entryPoints]
              [entryPoints.https]
                address = ":443"
              [entryPoints.https.tls]
                [[entryPoints.https.tls.certificates]]
                 certFile = "tests/traefik.crt"
                 keyFile = "tests/traefik.key"
    ...
    
  • Frontends are rule sets which define how to handle incoming requests from the entrypoint. The rule sets by itself are grouped by Modifiers and Matchers. Modifiers are only used to modify the request, e.g. by adding a path prefix. Matchers check patterns to forward to a defined backend, e.g by http headers.
         [frontends.frontend1.routes.test_1]
               rule = "Host:test1.localhost;Path:/test"
          [frontends.frontend2.headers.customresponseheaders]
             X-Custom-Response-Header = "True"
    ...
    
  • Backends are responsible for load-balancing the forwarded traffic from frontends by defined rule sets, e.g. based on a max connection limit. In this following example the 16th request will get a 429 ‘too many requests’ from Traefik.
          [backends]
            [backends.backend1]
              [backends.backend1.maxconn]
                 amount = 15
                 client.ip = "request.host"
       ...
    

Behind these three main concepts are further sub-concepts. Before we can move on to discussing some actual use cases we need to introduce one more concept, namely Labels:

  • Labels in the Traefik context are linked to Docker labels. With labels you can add metadata to docker objects like containers or services. Thanks to these labels we can configure Traefik to create internal routing rule sets. In the example below we use container and service labels to define an application service which is listening on port 9000.

          - "traefik.backend="my-coolest-app"
          - "traefik.docker.network=web"
          - "traefik.frontend.rule=Host:app.my-coolest-app.io"
          - "traefik.enable=true"
          - "traefik.port=9000"
     ...
    

To get hands-on with the Traefik basics you can find tutorials on katacoda.com

Traefik for a multi-dev environment

Use Case

We want to run several instances of an application and an ActiveMQ broker on a single virtual machine. Each application instance consists of an application server (Tomcat) and a relational database (MySQL). We want to assign different (sub-)domain names to the application server instances and to the web UI of ActiveMQ. Without Traefik we have to address the following problems:

  • The application servers need different ports. The standard port :443 can only be used for one instance.
  • A solution to map the different domain names is needed, e.g. configuring a DNS server.
  • A web server needs to be installed to handle TLS, as we do not want this handled by Tomcat

We found that these use cases could be addressed easily with Traefik.

Overview of Development Environment

General Network Setup

The application instances are set up as docker containers that communicate over a default bridge network. You can find those networks as 10.01.0/24 and 10.0.2.0/24 depicted below. They allow communication between Tomcat and the MySQL servers.

Both Tomcats provide their default port :8080 for HTTP(S) communication. Instead of exposing those ports directly we create a separate overlay network (10.0.0.0/24). Traefik uses this network to manage the traffic to the application servers. The same holds for ActiveMQ and its web UI that is provided on port :8161.

Traefik can be configured to provide access to all HTTP(S) endpoints through port :443 as a central access point for all clients and manage virtual hosts. The application servers become virtual hosts. Traefik routes HTTP requests to jms.mydomain.com:443 to activemq running on 10.0.0.64:8161 and those to 01-app.mydomain.com:443 to 10.0.0.46:8080 respectively.

As you can expect from a web server acting as a web proxy, Traefik can terminate the TLS connection and pass the requests as HTTP requests to the application servers.

Network Layout

Configuration

We used docker swarm to set up the docker services. Traefik is configured using the following docker-compose file:

version: '3.4'

services:
  traefik:
    command: >
      --docker
      --docker.watch
      --docker.swarmmode
      --docker.domain=mydomain.com
      --docker.exposedbydefault=false
      --docker.trace
    networks:
      - frontend
    ports:
      - "443:443"
...

This section shows how Traefik can be used to manage access to the services running on the network frontend.

This network has been set up as an overlay network:

docker network create --driver overlay traefik_ingress

We introduce a reference to this network as frontend in the docker-compose file for Traefik (and the services respectively):

networks:
  frontend:
    external:
      name: traefik_ingress

Traefik will automatically manage new services that connect to the frontend network. The sample docker-compose file for the application server illustrates this:

services:
  app:
    networks:
      - frontend
      - backend
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.port=8080"
        - "traefik.docker.network=traefik_ingress"
        - "traefik.frontend.rule=Host: 01-app.mydomain.com"

We connect the application server to two networks: frontend and backend. Backend is the bridge network automatically set up, whereas frontend is again our overlay network traefik_ingress. We connect the application service app to Traefik using following service labels:

traefik.enable tell Traefik to manage access to this service
traefik.port the service is available on port 8080
traefik.docker.network the managed network
traefik.frontend.rule the virtual host. All requests to this host are directed to the given service.

And as a second example, below is the docker-compose file for ActiveMQ:

services:
  activemq:
    networks:
      - frontend
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.port=8161"
        - "traefik.docker.network=traefik_ingress"
        - "traefik.frontend.rule=Host: jms.mydomain.com"

Traefik as a gateway for data integration pipelines

Use Case

For data integration of various systems we create data pipelines based on Apache Camel and Spring Boot and deploy them as Docker containers. Apache Camel is an open source integration framework based on the concepts of Enterprise Integration Patterns. These are Java applications which hold the processing logic and make sure the data is transformed and transferred to the desired endpoint destinations.

In the past, each data pipeline service was deployed on a different port. With the number of data pipelines growing with time, the environment became increasingly difficult to manage. We wanted something that would allow faster development of applications and shield the underlying data pipelines. Moreover not having to publish applications in different ports anymore, but rather use a gateway concept and APIs in combination like:
https://server/application/endpoint

Container Layout

Security

Traefik was a good choice to use as a gateway to shield the underlying data pipelines, while handling the HTTPS termination. Now the data pipelines only have the minimal code necessary to work, and the routing and security logic is in Traefik. Currently only the HTTPS port is opened for Traefik and the containers are not published through ports. As a result the only way to access a container is through the gateway.

Configuration hot swapping

Our first choice for a gateway was Nginx. It offered very good routing possibilities and is quite well known and documented. By deploying a new container, the Nginx configuration needed to be modified to route to the newly added container. The data pipelines are services which retrieve and transfer millions of records per hour and they would need to be forcibly stopped while the Nginx container is restarted with the new configuration.

I have always been intrigued by the concept of hot-swapping of server hard drives. You can just replace one while the system is running, after some time it will re-balance all the information without stopping the server. In software this should have been easier.

Using Traefik these issues are solved. The new container is deployed in the server and it is automatically discovered and the messages are routed accordingly from outside to the container. While Netflix Zuul in combination with Eureka offers a good alternative, they have a bigger memory footprint and no UI.

Load Balancing

For data pipelines the concept of load balancing is more difficult because the data is usually handled in a bulk manner and not per row basis. For the web application on the other hand is very easily to manage the load balancing in Traefik. The default technique was round robin, where the messages get forwarded in a cycle. If one of the containers stops, the requests will still be forwarded to the remaining ones.

Configuration

The configuration is similar to that of the above Use Case. The main difference being the use of the routing rule Headers:x-api-key. If the request includes a hidden API key it can pass through Traefik to the defined container. In this way we are using Traefik for simple API management.

services:
  datapipeline:
    labels:
      - "traefik.enable=true"
      - "traefik.backend=datapipeline"
      - "traefik.frontend.rule=Headers:x-api-key,${TRAEFIK_API_KEY}"

One of the routing rules we often use is PathPrefixStrip, this scans the URL of the request, removes the defined tag and passes it through to the container.

 - "traefik.frontend.rule=PathPrefixStrip:/datapipeline/"

The administration configuration relies on the traefik.toml file. We redirect all incoming http requests to https.

[entryPoints]
  [entryPoints.http]
    address = ":80"
      [entryPoints.http.redirect]
        regex = "^http://(.*)"
        replacement = "https://$1"

Basic authentication is used for the admin UI with this configuration.

[web]
  address = ":8080"
[web.auth.basic]
  users = ["admin:$apr1$F56akSHi$RgHA.h1ijgBSJmvvKwicu0"]

Downsides

Of course Traefik also has some downsides. It was built as simple as possible and does not include some of the richer features we need. Three important features not currently available are:

  • Support for JWT (JSON Web Tokens)
  • Not handling the security header x-api-key differently
  • Supporting only HTTP and not generic TCP routing

Currently you need to implement the JWT logic in each application or have a JWT service that would validate the tokens. Some non-official github branches have already implemented this and some are building the application from source and injecting this functionality.

The security header x-api-key is usually used for passing a secret key. In the admin UI this key can only be viewed by the administrator, still it is a security issue. For simple services that requires only one api-key when using Traefik. Handling multiple api-keys is more difficult and probably not meant to be used as an API gateway.

Since the routing is done based on HTTP headers, forwarding TCP traffic is not supported. We also deploy databases in Docker and they cannot be handled with Traefik, therefore the only way is to use a dedicated port.