Relationship between Vagrant, Docker, Chef and OpenStack (or similar products)?

I am a web developer, but I am also interested in a few administrative tasks. Hence, the recent shift from pure administration to dev-ops comes in handy for me.

Anyway, I have trouble seeing how a few things relate to each other. Maybe they don't relate at all, so I wanted to ask for help clarifying this.

Basically, I want to relate four types of software to each other (as I understand them). The exact products don't matter; you can substitute any similar software:

  • Vagrant: From my understanding, it automates the creation and management of VMs: setting them up, starting and stopping them. This can be done with a local VM or a remote one, e.g. on a cloud platform.
  • Docker: A "lightweight VM", based on a few Linux kernel concepts, which can be used to run processes in isolation, e.g. in a shared web hosting environment.
  • Chef: A tool to set up and configure an operating system, e.g. inside a VM.
  • OpenStack: A tool that allows you to build your own private cloud, hence comparable to something such as AWS.

Question #1: Are my explanations right, or am I wrong with some (or all) of these assumptions?

Question #2: How could I mix all those tools? Would that make any sense?

In my imagination and from my point of understanding, you could go and

  • use OpenStack to build your own cloud,
  • use Vagrant to manage the VMs run in the cloud,
  • use Chef to set up these VMs,
  • and finally use Docker to run processes inside the VMs.

Is this correct? And if so, can you give me some advice on how to start using all this (it's quite a lot at the same time, and I don't know yet where to start)?

Let's use their respective web pages to find out what all these projects are about. I'll change the order in which you listed them, though:

  • Chef: Chef is an automation platform that transforms infrastructure into code.

    This is configuration management software. Most such tools use the same paradigm: they allow you to define the state you want a machine to be in, with regard to configuration files, installed software, users, groups and many other resource types. Most of them also provide functionality to push changes onto specific machines, a process usually called orchestration.

  • Vagrant: Create and configure lightweight, reproducible, and portable development environments.

    It provides a reproducible way to generate fully virtualized machines using either Oracle's VirtualBox or VMware technology as providers. Vagrant can coordinate with configuration management software to continue the installation process where the operating system's installer finishes. This is known as provisioning.

  • Docker: An open source project to pack, ship and run any application as a lightweight container

    The functionality of this software somewhat overlaps with that of Vagrant, in that it provides the means to define operating system installations, but it differs greatly in the technology used for this purpose. Docker uses Linux containers, which are not virtual machines per se, but isolated processes running in isolated filesystems. Docker can also use a configuration management system to provision the containers.

  • OpenStack: Open source software for building private and public clouds.

    While it is true that OpenStack can be deployed on a single machine, such a deployment is purely a proof of concept and probably not very functional due to resource constraints.

    The primary target for OpenStack installations is bare-metal multi-node environments, where the different components can run on dedicated hardware to achieve better results.

    A key feature of OpenStack is its support for many virtualization technologies, from fully virtualized (KVM/QEMU, VMware) to paravirtualized (Xen), as well as containers (LXC) and even User Mode Linux (UML).
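
The "define the state you want a machine to be in" paradigm described for Chef above can be illustrated with a small sketch. This is not Chef's API or how any real tool is implemented; `MachineState`, `plan` and `converge` are hypothetical names, and the "machine" is just an in-memory object. It only shows the idea that you declare a target state, the tool works out the difference, and running it a second time changes nothing.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MachineState:
    """Hypothetical, heavily simplified model of a machine's state."""
    packages: set[str] = field(default_factory=set)
    users: set[str] = field(default_factory=set)


def plan(current: MachineState, desired: MachineState) -> list[str]:
    """List the actions needed to bring `current` up to `desired`."""
    actions = []
    for pkg in sorted(desired.packages - current.packages):
        actions.append(f"install package {pkg}")
    for user in sorted(desired.users - current.users):
        actions.append(f"create user {user}")
    return actions


def converge(current: MachineState, desired: MachineState) -> list[str]:
    """Apply the plan in memory; a real tool would change the actual system."""
    actions = plan(current, desired)
    current.packages |= desired.packages
    current.users |= desired.users
    return actions


if __name__ == "__main__":
    machine = MachineState(packages={"openssh"})
    wanted = MachineState(packages={"openssh", "nginx"}, users={"deploy"})
    print(converge(machine, wanted))  # first run: two actions
    print(converge(machine, wanted))  # second run: nothing left to do
```

The second call returns an empty list because the machine already matches the declared state, which is the property that makes this kind of tool safe to run repeatedly.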

I've tried to present these products as components of a specific architecture. From my point of view, it makes sense to first be able to define your needs with regard to the environment you need (Chef, Puppet, Ansible, ...), then be able to deploy it in a controlled fashion (Vagrant, Docker, ...) and finally scale it to global size if need be.

How much of all this functionality you need should be defined in the scope of your project.

Also note that I've oversimplified almost all of the technical explanations. Please use the referenced links for detailed information.

I think coming from a developer background will actually make getting into 'devops' trickier. Your question is almost 3 years old, so it would be interesting to hear how you are finding the journey. I will answer from the point of view of a sys admin about the applications you mentioned, and hopefully that will shed some light, or at least give a non-technical perspective on why a person (admin or dev) would start asking exactly what you asked: from the devops perspective, what is the relationship between x, y and z, and are these tools greater than the sum of their parts?

I actually think sys admins have the upper hand here. Most of the applications you mention in your question solve admin 'problems', and in doing so provide a more abstract data center environment, which in turn is more programmable for developers and the new 'devops' strategy (read: strategy/team; devops is not a person). So what is the relationship between the apps you mention? How do they provide a holistic approach to the IT service?

OpenStack: A tool that allows you to build your own private cloud, hence comparable to something such as AWS

That's what it is, but what does it do? The most aptly named operating system was DOS: it operated your disk by abstracting the BIOS. OpenStack operates your data center and abstracts your infrastructure (IaaS is jargon for a data center operating system). Now your data center has an API, a command syntax and a GUI. OpenStack can drive hypervisors, switches, routers, firewalls, storage area networks, load balancers, Docker hosts, etc. OpenStack uses your hardware manufacturer's 'plugin', or the particular function can exist solely in software, as software-defined something or network function virtualization. On top of this, OpenStack, like all other clouds, can orchestrate its own infrastructure by reading scripts you throw at the orchestration engine, or by reacting to rules (scale up, scale down, etc.). So OpenStack is a giant layer of abstraction: "I don't care what switch I have, give me a network with this command", or "build me a complicated load-balanced, HA, publicly available, auto-scaling, domain-name-registered, storage-attached thingy with this script I found on the internet".

Docker: A "lightweight VM", based on a few Linux kernel concepts, which can be used to run processes in isolation, e.g. in a shared web hosting environment.

Docker is another layer of abstraction and, like cloud, a disruptive technology. It is changing the industry because it solves many operational 'problems' such as software dependencies, upgrades, data isolation and sheer portability. Java became popular because of its portability, which developers didn't have to think about: a running JVM meant that their code should run on the coffee machine, so long as it supported Java. Docker solves a similar problem. To run my app you need a Docker host, rather than this version of Python, this kernel, this Linux distro and so on. The app still has those dependencies, of course, but the underlying host doesn't care, and the admin doesn't care what you do inside an isolated container (to a point). Docker is changing both the development and the operations paradigm by treating an entire operating system and its services like a binary: we can get them from a repository, version them, modify them, run them with parameters, etc.

Chef: A tool to set up and configure an operating system, e.g. inside a VM.

Yes, and it is not as disruptive as the first two. Chef, Puppet, Ansible, Salt, System Center Operations Manager and a plethora of other applications in this space provide a way for developers and admins to model deployments, upgrades and other actions (config changes). There doesn't seem to be any standards body overseeing these efforts like there is for cloud. And since we're not dealing with something as definitive as infrastructure, it's more painful to learn these tools, and not much is transferable from one to the other.

Vagrant: From my understanding, it automates the creation and management of VMs: setting them up, starting and stopping them. This can be done with a local VM or a remote one, e.g. on a cloud platform.

This is the odd one out in the list of apps you mention. Vagrant is a tool for developers and a toy for admins: you can quickly stand up a development environment with Vagrant, e.g. if I want to develop an Android app, I grab an IDE from Vagrant. I think it will be overtaken by Docker soon.

can you give me some advice on how to start using all this (it's quite a lot at the same time, and I don't know yet where to start)?

This is why I think admins have the upper hand: we have had to do most of this manually and know what can go wrong, so Puppet manifests, cloud computing and Docker orchestration will come easier to us. Developers will find themselves going off on many tangents, so my advice to any potential devops is to be an admin first.

On my end, I'm using a combination of Vagrant and Docker only.

I use Vagrant to provision the machines. There are additional cloud providers, but I am using the built-in VirtualBox one. Because of this approach, the external networking and storage are pretty much manual, but if you use something like the vagrant-aws plugin you can tell AWS to provision the necessary parts for you.

The provisioning script I use points to a secure location which contains the CA certificate and keys used for signing CSRs, along with the Docker swarm join tokens. In addition, I install docker-engine and configure it to join the swarm (or to initialize one if there isn't any).

Once that is settled, I simply do a docker stack deploy from my local machine or build box to deploy the stack with everything I need.

In my case I dropped Chef in favor of simple post-installation scripts that run yum or apt-get as my provisioning scripts.

I also use the vagrant-triggers plugin to add additional scripting before destroy (in my case to leave the swarm).
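
The flow described above (join the swarm or initialize one, deploy the stack, leave the swarm before destroy) can be sketched roughly as follows. This is not the actual provisioning script, which isn't shown here; the function names, the token, the manager address and the stack name are all made up for illustration. The sketch only builds the `docker` command lines and does not run them.

```python
from __future__ import annotations


def provision_commands(join_token: str | None, manager_addr: str | None) -> list[list[str]]:
    """Commands a provisioning script might run on a freshly created VM.

    Joins an existing swarm when a token and manager address are known,
    otherwise initializes a new swarm on this node.
    """
    if join_token and manager_addr:
        return [["docker", "swarm", "join", "--token", join_token, manager_addr]]
    return [["docker", "swarm", "init"]]


def deploy_command(compose_file: str, stack_name: str) -> list[str]:
    """Command run from the local machine or build box to deploy the stack."""
    return ["docker", "stack", "deploy", "-c", compose_file, stack_name]


def teardown_commands() -> list[list[str]]:
    """Commands a before-destroy trigger might run on the VM."""
    return [["docker", "swarm", "leave"]]


if __name__ == "__main__":
    # Placeholder values, not real credentials or hosts.
    for cmd in provision_commands("SWMTKN-example", "192.0.2.10:2377"):
        print(" ".join(cmd))
    print(" ".join(deploy_command("docker-compose.yml", "mystack")))
    for cmd in teardown_commands():
        print(" ".join(cmd))
```

Each returned list could be handed to `subprocess.run` inside the VM or on the build box. Keeping the commands as data makes the join-or-init decision easy to check without a Docker host.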

The nice part of centralizing with Vagrant is that you can replicate the environment on another system, or on a single computer for development; you just have to add or change the provider section. Mind you, I haven't gone through setting up OpenStack on a single computer to manage VirtualBox.

I just finished an OpenStack deployment project which uses a Chef server inside of a Vagrant instance: https://github.com/bluechiptek/bluechipstack/blob/master/README.md

The primary problem with doing it this way is giving the Vagrant instance the same IP each time you want to manage the nodes. If you use static addressing, it works well. Doing it via a VPN is less than ideal.

I'm not skilled enough to answer this completely, but your assessment of Vagrant and Chef seems to be correct. On my development box, I spin up VMs using Vagrant and then provision them with Chef, and it works really well.