The Big Infra
While working on Consul and Vault and discovering their features, I began to think I had enough tools to build a complete infrastructure that would follow my principles and help me reach my goals (presented here).
So I decided to build one: let’s have a look.
Of course, this demo is only a demo and is probably not well-suited for real use-cases.
I want to build an infrastructure that will serve a simple service: an echo TCP server based on socat (here is a simple Ansible playbook).
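To make the behaviour of the service concrete, here is a minimal sketch of an echo TCP server. It is a Python stand-in written for illustration only: the demo itself runs socat, and the names `EchoHandler`, `start_echo_server` and `echo_once` are hypothetical helpers, not part of the project.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Send every received chunk straight back to the client."""

    def handle(self):
        while True:
            data = self.request.recv(4096)
            if not data:  # the client closed its side of the connection
                break
            self.request.sendall(data)


def start_echo_server(host="127.0.0.1", port=0):
    """Start a threaded echo server in the background (port 0 = any free port)."""
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def echo_once(address, payload):
    """Connect, send the payload, and return whatever the server sends back."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(payload)
        conn.shutdown(socket.SHUT_WR)  # tell the server we are done sending
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling `start_echo_server()` and then `echo_once(server.server_address, b"hello")` should return the same bytes that were sent, which is all the service in this demo has to do.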
Here are the constraints for the infrastructure:
- local redundancy: multiple servers in the same datacenter
- geo redundancy: multiple datacenters
- SSL communications (everywhere, every time, as soon as possible)
- SSL certs and secrets have TTLs (3 days)
- per-host unique accesses to secrets
- only necessary accesses are granted
- autonomous and self-healing:
- services are discovered automatically
- failures are detected automatically and flows are routed to other parts of the infrastructure
- SSL certs and secrets are updated automatically
- immutability? as much as possible
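The automatic discovery and failure detection listed above can be sketched from the client side as follows. This is only an illustration, assuming the Consul HTTP health endpoint (`/v1/health/service/<name>`) and its documented response shape; the helper names, the `echo` service name and the addresses in the usage note are hypothetical, and the demo's actual routing may work differently.

```python
from urllib.parse import quote


def health_url(consul_addr, service, datacenter=None):
    """Build the health endpoint URL; `passing` asks Consul to filter for us."""
    url = f"{consul_addr}/v1/health/service/{quote(service)}?passing"
    if datacenter:
        url += f"&dc={quote(datacenter)}"
    return url


def healthy_endpoints(entries):
    """Return (address, port) for each instance whose checks are all passing.

    `entries` is the decoded JSON list returned by the health endpoint.
    The local filter is stricter than Consul's `passing` flag, which also
    lets instances in the warning state through.
    """
    endpoints = []
    for entry in entries:
        if any(check["Status"] != "passing" for check in entry.get("Checks", [])):
            continue
        service = entry["Service"]
        # Service.Address may be empty, in which case the node address applies.
        address = service.get("Address") or entry["Node"]["Address"]
        endpoints.append((address, service["Port"]))
    return endpoints
```

A client would fetch `health_url("https://consul.example:8501", "echo")`, decode the JSON body and pass it to `healthy_endpoints`; instances with a failing check simply disappear from the list, so traffic goes to the remaining ones.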
Here are the constraints for the deployment:
- no manual actions: everything must be done with the deployment tools I will use
- SSL communications
- SSHFP when possible
- SSH fingerprints
- no secrets in configurations
- split of responsibilities
- clean and reusable bricks
- relatively fast (I will be using AWS EC2 instances)
My initial plan was to also use the Auto Scaling feature from AWS.
But I was not able to find a way to create, in a secure manner, per-host unique accesses to Vault, using this method.
So the infrastructure will not scale automatically.
Here is the list of the different tools I used:
Service Discovery: Consul 1.6
Secrets Management: Vault 1.1.5
I only have access to the free editions of Consul and Vault.
This project gave me the opportunity to work with a lot of tools, and I learned and built a lot.
It also gave me the opportunity to open and participate in tickets and PRs in the Consul repository and in some Terraform provider repositories (for example: Consul #5602, Consul #6192, Consul #6284, Terraform Provider Cloudflare #428).
Here is a global view of the infrastructure:
The picture does not show some important parts of the project:
- the security groups and routing tables that only allow the strictly necessary communications
- the renegotiations and regenerations of SSL certs and Vault tokens
- the deployment method!
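The regeneration of short-lived certs and tokens comes down to a timing decision: renew early enough that a failed attempt can be retried before expiry. Here is a minimal sketch of that decision, assuming the 3-day TTL from the constraints; the two-thirds threshold and the function names are hypothetical choices for illustration, not the values used in the demo.

```python
from datetime import datetime, timedelta, timezone

TTL = timedelta(days=3)   # TTL of SSL certs and secrets, from the constraints
RENEW_FRACTION = 2 / 3    # hypothetical: renew once two thirds of the TTL has elapsed


def renew_at(issued_at, ttl=TTL, fraction=RENEW_FRACTION):
    """Return the moment from which a renewal should be attempted."""
    return issued_at + ttl * fraction


def needs_renewal(issued_at, now, ttl=TTL, fraction=RENEW_FRACTION):
    """True when the credential is old enough that it should be regenerated."""
    return now >= renew_at(issued_at, ttl, fraction)


def is_expired(issued_at, now, ttl=TTL):
    """True when the credential can no longer be used at all."""
    return now >= issued_at + ttl
```

With these values a cert issued on Monday at noon is left alone on Tuesday, becomes due for renewal from Wednesday at noon, and expires on Thursday at noon, which leaves a full day for retries.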
If I can find some time, I would really like to work on these features:
- upgrade to Ansible 2.8
- some Ansible roles should be transformed into Ansible Python modules as they are only API calls
- finish/fix/update/correct the Ansible documentation because it’s a total mess for now
- some parts of the deployment should be done in Terraform but for now the Consul and Vault providers do not have them
- upgrade to Debian 10 Buster
- add Ansible tests with Molecule
- add monitoring (Grafana, Prometheus)
- try Mitogen
3, 2, 1, Let’s jam!
Do you want more information? Do you want to try this demo yourself? Go!
This will probably take (in cumulative time): 2 hours for the preparations and 4 hours for the deployment.
But there will be a lot of coffee breaks.
Here are the chapters:
- getting started: what you need before starting
- the working environment: what you need for your working environment
- configure your infrastructure: how to customize the Terraform and Ansible configuration files for your infrastructure
- the system AMIs: how to create the base Debian system AMIs
- the bastion host: how to create the bastion network and the bastion host
- the Consul/Vault clusters: how to create the Consul and Vault clusters
- the echo service AMIs: how to create the 'echo' services AMIs
- the echo infrastructure: how to create the 'echo' infrastructure
- blue/green deployment: how to deploy (blue/green) an upgrade of the 'echo' service
- how to clean everything
- annexe 1 - approles: how to prepare and provision a host
- annexe 2 - tfstates: how to store the Terraform states in Consul