How do I deploy a docker container and associated data container, including contents?

I'll start by admitting I'm pretty new to Docker and I may be approaching this problem from the wrong set of assumptions... let me know if that's the case. I've seen lots of discussion of how Docker is useful for deployment but no examples of how that's actually done.

Here's the way I thought it would work:

  1. create the data container to hold some persistent data on machine A
  2. create the application container which uses volumes from the data container
  3. do some work, potentially changing the data in the data container
  4. stop the application container
  5. commit & tag the data container
  6. push the data container to a (private) repository
  7. pull & run the image from step 6 on machine B
  8. pick up where you left off on machine B

The key step here is step 5, which I thought would save the current state (including the contents of the file system). You could then push that state to a repository & pull it from somewhere else, giving you a new container that is essentially identical to the original.

But it doesn't seem to work that way. What I find is that either step 5 doesn't do what I think it does or step 7 (pulling & running the image) "resets" the container to its initial state.

I've put together a set of three Docker images and containers to test this: a data container, a writer which writes a random string into a file in the data container every 30 s, and a reader which simply echoes the value in the data container file and exits.

Data container

Created with

docker run \
    --name datatest_data \
    -v /datafolder \
    myrepository:5000/datatest-data:latest

Dockerfile:

FROM ubuntu:trusty

# make the data folder
#
RUN mkdir /datafolder

# write something to the data file
#
RUN echo "no data here!" > /datafolder/data.txt

# expose the data folder
#
VOLUME /datafolder

Writer

Created with

docker run \
    --rm \
    --name datatest_write \
    --volumes-from datatest_data \
    myrepository:5000/datatest-write:latest

Dockerfile:

FROM ubuntu:trusty

# Add script
#
ADD run.sh /usr/local/sbin/run.sh
RUN chmod 755 /usr/local/sbin/*.sh

CMD ["/usr/local/sbin/run.sh"]

run.sh

#!/bin/bash

while :
do
    sleep 30s

    NEW_STRING=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)

    echo "$NEW_STRING" >> /datafolder/data.txt

    date >> /datafolder/data.txt

    echo "wrote '$NEW_STRING' to file"
done

This script writes a random string and the date/time to /datafolder/data.txt in the data container.

Reader

Created with

docker run \
    --rm \
    --name datatest_read \
    --volumes-from datatest_data \
    myrepository:5000/datatest-read:latest

Dockerfile:

FROM ubuntu:trusty

# Add scripts
ADD run.sh /run.sh
RUN chmod 0777 /run.sh

CMD ["/run.sh"]

run.sh:

#!/bin/bash

echo "reading..."

echo "-----"

cat /datafolder/data.txt

echo "-----"

When I build & run these containers, they run fine and work the way I expect:

Stop & Start on the development machine:

  1. create the data container
  2. run the writer
  3. run the reader immediately, see the "no data here!" message
  4. wait a while
  5. run the reader, see the random string
  6. stop the writer
  7. restart the writer
  8. run the reader, see the same random string

But committing & pushing do not do what I expect:

  1. create the data container
  2. run the writer
  3. run the reader immediately, see the "no data here!" message
  4. wait a while
  5. run the reader, see the random string
  6. stop the writer
  7. commit & tag the data container with docker commit datatest_data myrepository:5000/datatest-data:latest
  8. push to the repository
  9. delete all the containers & recreate them

At this point, I would expect to run the reader & see the same random string, since the data container has been committed, pushed to the repository, and then recreated from the same image in the repository. However, what I actually see is the "no data here!" message.

Can someone explain where I'm going wrong here? Or, alternatively, point me to an example of how deployment is done with Docker?

You have made a wrong assumption about how volumes work in Docker. I'll try to explain how volumes relate to Docker containers and Docker images, and hopefully the difference between data volumes and data volume containers will become clear.

First, let's recall a few definitions.

Docker images

Docker images are essentially a union filesystem plus metadata. You can inspect the filesystem contents with the docker export command (which operates on a container created from the image), and you can inspect an image's metadata with the docker inspect command.

Data volumes

From the Docker user guide:

A data volume is a specially-designated directory within one or more containers that bypasses the Union File System to provide several useful features for persistent or shared data.

It is important to note that a given volume (the directory or file that contains the data) is reusable only if at least one Docker container is using it. Docker images don't have volumes; they only have metadata that says where volumes should be mounted on the union filesystem. Data volumes aren't part of a container's union filesystem either. So where are they? Under /var/lib/docker/volumes on the Docker host (while containers are stored under /var/lib/docker/containers).

Data volume containers

There is nothing special about this type of container. It is just a container, usually stopped, that uses a data volume, and its sole purpose is to make sure at least one container is using that volume. Remember, as soon as the last container (running or stopped) using a given data volume is deleted, that volume becomes unreachable through the docker run --volumes-from option.

Working with data volume containers

How to create a data volume container

The image used to create a data volume container doesn't matter, since such a container can remain stopped and still fulfil its purpose. So to create a data container named datatest_data for a volume in /datafolder you only need to run:

docker run --name datatest_data --volume /datafolder busybox true

Here busybox is the image name (a conveniently small one) and true is a command we provide just to keep the docker daemon from complaining about a missing command. After that, you have a stopped container named datatest_data whose sole purpose is to let you reach that volume with the --volumes-from option of the docker run command.

How to read from a data volume container

I know two ways of reading a data volume. The first is through a container: if you cannot get a shell in an existing container that uses the data volume, you can run a new container with the --volumes-from option for the sole purpose of reading that data.

For instance:

docker run --rm --volumes-from datatest_data busybox cat /datafolder/data.txt

The other way is to copy the volume from the /var/lib/docker/volumes folder. You can discover the name of the volume in that folder by inspecting the metadata of one of the containers using the volume. See this answer for details.
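A minimal sketch of that metadata lookup, assuming Docker 1.8 or later, where docker inspect reports a Mounts list (older releases expose a Volumes map instead). The function name volume_paths is made up for illustration.

```shell
# Print "<mount point in container> -> <directory on host>" for every
# volume of the given container.
# Assumption: the daemon reports a .Mounts list (Docker 1.8 or later).
volume_paths() {
    docker inspect \
        --format '{{ range .Mounts }}{{ println .Destination "->" .Source }}{{ end }}' \
        "$1"
}
```

Calling volume_paths datatest_data should list /datafolder together with the host directory that backs it. Reading that directory usually requires root on the Docker host.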

Working with volumes (since Docker 1.9.0)

How to create a volume (since Docker 1.9.0)

Docker 1.9.0 introduced a new command, docker volume, which lets you create volumes:

docker volume create --name hello

How to read from a volume (since Docker 1.9.0)

Let's say you created a volume named hello with docker volume create --name hello. You can then mount it in a container with the -v option:

docker run -v hello:/data busybox ls /data
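To see that a named volume outlives the containers that mount it, here is a small sketch that writes a file from one throwaway container and reads it back from another. The function name volume_roundtrip and the file name greeting.txt are made up for illustration.

```shell
# Write a line into a named volume from one container, then read it
# back from a second, unrelated container.
# Assumption: Docker 1.9 or later, so named volumes are available.
volume_roundtrip() {
    volume="$1"
    # First container: write into the volume and exit (--rm deletes the container).
    docker run --rm -v "$volume":/data busybox \
        sh -c 'echo "written by container 1" > /data/greeting.txt'
    # Second container: the file is still there because it lives in the volume.
    docker run --rm -v "$volume":/data busybox cat /data/greeting.txt
}
```

Running volume_roundtrip hello uses the volume created above. Both containers are removed as soon as they exit, but the volume and its contents remain.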

About committing & pushing containers

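It should now be clear that since data volumes aren't part of a container's union filesystem, committing a container to produce a new docker image won't persist any data that lives in a data volume. That is also why you see "no data here!" after recreating the containers: the new data container gets a fresh volume, which Docker initializes with the content the image has at /datafolder, and that is still the file written at build time.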
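You can check this yourself with a throwaway container. This is only a sketch: the names commit_demo and commit_demo_image and the function name are made up, and the final cat is expected to fail because the file was written into the volume rather than into the container's own filesystem.

```shell
# Show that docker commit does not capture data written to a volume.
commit_misses_volume() {
    # Write a file into an anonymous volume mounted at /datafolder.
    docker run --name commit_demo -v /datafolder busybox \
        sh -c 'echo "written at runtime" > /datafolder/data.txt'
    # Commit the stopped container to a new image.
    docker commit commit_demo commit_demo_image
    # Look for the file in a container started from the committed image.
    # Expected to fail: the file is in the old volume, not in the image.
    docker run --rm commit_demo_image cat /datafolder/data.txt
    # Clean up the demo container (with its volume) and the demo image.
    docker rm -v commit_demo
    docker rmi commit_demo_image
}
```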

Making backups of data volumes

The docker user guide has a nice article about making backups of data volumes.
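For the move from machine A to machine B described in the question, that backup approach could look roughly like the sketch below. The function names, the archive name backup.tar and the /backup mount point are choices made for this example, and the host directory should be given as an absolute path.

```shell
# Archive a volume held by a data container into <host-dir>/backup.tar.
# usage: backup_volume <data-container> <volume-path> <host-dir>
backup_volume() {
    docker run --rm --volumes-from "$1" -v "$3":/backup busybox \
        tar cf /backup/backup.tar "$2"
}

# Unpack <host-dir>/backup.tar into the volume of a data container.
# tar stores the paths without the leading /, so extracting relative
# to / puts the files back at their original location.
# usage: restore_volume <data-container> <host-dir>
restore_volume() {
    docker run --rm --volumes-from "$1" -v "$2":/backup busybox \
        tar xf /backup/backup.tar -C /
}
```

On machine A you would run backup_volume datatest_data /datafolder "$(pwd)" and copy backup.tar to machine B. There, create a new data container as shown earlier, then run restore_volume datatest_data "$(pwd)" before starting the application container.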


Good article regarding volumes: http://container42.com/2014/11/03/docker-indepth-volumes/

You could also use a docker data container to deploy code.

I don't know if it's good practice, but I do it like this:

FROM ubuntu:trusty

# make the data folder
#
RUN mkdir /data-image

# in my case, I have a 
# ADD dest.tar /data-image/
#
# but to follow your example :
# write something to the data file
RUN echo "no data here!" > /data-image/data.txt

# expose the data folder 
#
VOLUME /datafolder

ENTRYPOINT cp -r /data-image/* /datafolder/

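You can now push your image and use --volumes-from, etc. Each time a container is started from this image, the entrypoint copies the contents of /data-image into the /datafolder volume.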