Using Docker with the CMS open data
Overview
Teaching: Self-guided
Exercises: 40 min
Questions
How do I use docker to effectively interface with the CMS open data?
Objectives
Download the CMS open data docker image
Open your own CMS open data container and check that graphical windows open
Restart the same container
Copy files into or out of the container
Delete and rebuild containers
Share a local directory from your computer to the container (pass a volume)
Overview
This exercise will walk you through setting up and familiarizing yourself with Docker, so that you can effectively use it to interface with the CMS open data. It is not meant to completely cover containers and everything you can do with Docker. Reach out to the organizers using the dedicated Mattermost channel if we are missing something.
Some guidance can be found on the Open Data Portal introduction to Docker. However, the use of graphical interfaces, such as the graphics window from ROOT, depends on the operating system of your computer. Therefore, in the following, separate instructions are given for Windows WSL, Linux and MacOS.
Download the docker image for CMS open data and start a container
The first time you start a container, a docker image file gets downloaded from an image registry. The CMS open data image is large and it may take some time to download, even as long as 20-30 minutes, depending on the speed of your internet connection. After the download, a container created from that image starts. The download needs to be done only once. Afterwards, when starting a container, it will find the downloaded image on your computer, and it will be much faster.
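To see whether the image has already been downloaded, you can ask Docker for the local images that match its name. The following is a minimal sketch rather than part of the lesson itself: have_image is a hypothetical helper name, and it assumes the docker command is installed. For the VNC variant of the image, change IMAGE accordingly.

```shell
IMAGE="cmsopendata/cmssw_5_3_32"

# have_image NAME: succeed if at least one local image matches NAME.
# "docker images -q NAME" prints one image ID per match and nothing otherwise.
# (have_image is a hypothetical helper name used only in this sketch.)
have_image() {
  [ -n "$(docker images -q "$1" 2>/dev/null)" ]
}

if have_image "$IMAGE"; then
  echo "$IMAGE is already on this computer; starting a container will be fast."
else
  echo "$IMAGE is not here yet; the first docker run will download it."
fi
```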
Please follow the instructions below, depending on the operating system you are using.
Linux
We will use the docker run command to create the container (downloading the appropriate image if it is the first time) and start it right away.
docker run -it --name my_od --net=host --env="DISPLAY" -v $HOME/.Xauthority:/home/cmsusr/.Xauthority:rw cmsopendata/cmssw_5_3_32:latest /bin/bash
Setting up CMSSW_5_3_32
CMSSW should now be available.
[21:53:43] cmsusr@docker-desktop ~/CMSSW_5_3_32/src $
This is now a bash shell in the CMS open data environment in which you have access to a complete CMS software release that is appropriate for interfacing with the 2011 and 2012 7 and 8 TeV datasets.
As there are rate limits for pulls from Docker Hub, you may get the following error message:
docker: Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading.
In that case, try later (the limit is per 6 hours) or use the mirror gitlab-registry.cern.ch/cms-cloud/cmssw-docker-opendata/cmssw_5_3_32 instead of cmsopendata/cmssw_5_3_32.
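If you run into the rate limit, it can help to keep the image reference in one place so that switching to the mirror is a one-line change. This is only a sketch: USE_MIRROR and image_ref are hypothetical names chosen for illustration, while the two image paths are the ones given above.

```shell
# Image locations as given in the text above.
IMAGE_HUB="cmsopendata/cmssw_5_3_32"
IMAGE_MIRROR="gitlab-registry.cern.ch/cms-cloud/cmssw-docker-opendata/cmssw_5_3_32"

# image_ref: print the mirror path if USE_MIRROR=1, otherwise the Docker Hub path.
# (USE_MIRROR and image_ref are hypothetical names, not Docker features.)
image_ref() {
  if [ "${USE_MIRROR:-0}" = "1" ]; then
    echo "$IMAGE_MIRROR"
  else
    echo "$IMAGE_HUB"
  fi
}

echo "Image to use in docker run: $(image_ref)"
```

Setting USE_MIRROR=1 before calling image_ref selects the mirror; the rest of the docker run command stays the same.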
Now let’s understand the options that were used for the docker run command.
- First, the -it option (short for -i -t) means to start the container in interactive mode with a terminal attached. Essentially, it means that you will end up inside the running container.
- We assign a name to the container using the --name switch, so that we can refer back to this environment and still access any files we created in there. You can, of course, choose a different name than my_od.
- The --net=host switch will allow you to use the host network (Internet access) in the container.
- The --env switch will forward the appropriate DISPLAY environment variable from the host machine to the container so that X11 forwarding (the ability to open graphical windows inside the container) can be achieved.
- For X11 forwarding to be functional, your local $HOME/.Xauthority file needs to be mounted as the /home/cmsusr/.Xauthority file inside the container. We do this using the --volume (or -v) switch. Note that the colon (:) symbol separates the source and destination points for the mounting procedure. In addition, the rw tag is given (also separated by :) so it can be read and written if necessary. Optionally, you could mount any directory from your local machine to the container using the -v option. This is sometimes useful; for instance, by adding -v /home/joe/playground:/playground to the command line, the playground area can be mounted on the container and serve as a shared area between your local machine and the container. You will check out an example below.
- cmsopendata/cmssw_5_3_32:latest is the name (and :version) of the image we will use. If no registry is prepended, Docker assumes that it resides in Docker Hub, the official image repository of Docker.
- Finally, the /bin/bash option will throw the container into a bash shell when running interactively.
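The options above can be kept together in a small script with one comment per option. The sketch below is an illustration, not the lesson's own tooling: run and DRY_RUN are hypothetical names, and by default the script only prints the command so that you can check it before executing it.

```shell
# run: print the command when DRY_RUN=1 (the default in this sketch),
# execute it otherwise. (run and DRY_RUN are hypothetical names.)
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# -it          interactive mode with a terminal
# --name       name used later with "docker start"
# --net=host   use the host network
# --env        forward DISPLAY for X11 forwarding
# -v           mount ~/.Xauthority read-write inside the container
run docker run -it --name my_od --net=host --env="DISPLAY" \
  -v "$HOME/.Xauthority:/home/cmsusr/.Xauthority:rw" \
  cmsopendata/cmssw_5_3_32:latest /bin/bash
```

Running the script with DRY_RUN=0 executes the command instead of printing it.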
For a more complete listing of options, see the official Docker documentation on the docker run command.
To test that X11 forwarding works, start the ROOT program by typing root in the container prompt. In the ROOT prompt, type TBrowser t to open the ROOT graphical window. If the graphical window opens you are all set and you can exit from ROOT either by choosing the option from the TBrowser window or by typing .q in the ROOT prompt.
Make sure that you can copy instructions from a browser page to the container terminal. One thing you can try is Shift+Ctrl+V when pasting into your container terminal, rather than Ctrl+V. That sometimes will work. If not, you will see later in these instructions how to pass files from your local computer to the container.
Then type exit to leave the container.
If you find that X11 forwarding is not working and the ROOT graphical window does not open, try typing the following before starting your Docker container.
xhost local:root
If everything works fine, you are ready to continue with the lesson.
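If the window still does not open, a quick check on the host can narrow things down, since the docker run command above relies on both the DISPLAY variable and the ~/.Xauthority file. The following is a rough sketch to run on your local machine; check_x11_prereqs is a hypothetical helper name, and passing the check does not guarantee that forwarding works.

```shell
# check_x11_prereqs: report missing host-side inputs of the docker run command.
# (check_x11_prereqs is a hypothetical name used only in this sketch.)
check_x11_prereqs() {
  status=0
  if [ -z "${DISPLAY:-}" ]; then
    echo "DISPLAY is not set on the host"
    status=1
  fi
  if [ ! -f "$HOME/.Xauthority" ]; then
    echo "$HOME/.Xauthority does not exist"
    status=1
  fi
  return "$status"
}

if check_x11_prereqs; then
  echo "Host provides DISPLAY and .Xauthority"
fi
```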
If you still have problems with X11 forwarding
Only if you are having problems with X11 forwarding, there is the option to create a container from an image with a VNC application installed, cmsopendata/cmssw_5_3_32_vnc:latest:
docker run -it --name my_od -P -p 5901:5901 cmsopendata/cmssw_5_3_32_vnc:latest /bin/bash
This application allows opening graphical windows on a remote machine (seen from the container, your own computer is a remote machine). Start the application with start_vnc from your container prompt, and choose a password. You will need to start it every time you use the container (if you want to open graphics windows), but you will define the password only the first time.
~/CMSSW_5_3_32/src $ start_vnc
You will require a password to access your desktops.
Password:
Verify:
xauth: file /home/cmsusr/.Xauthority does not exist
New 'myvnc:1' desktop is e0ca768960bf:1
Starting applications specified in /home/cmsusr/.vnc/xstartup
Log file is /home/cmsusr/.vnc/e0ca768960bf:1.log
VNC connection points:
VNC viewer address: 127.0.0.1:5901
OSX built-in VNC viewer command: open vnc://127.0.0.1:5901
To kill the vncserver enter 'vncserver -kill :1'
When you do this the first time, download a VNC viewer to your local machine from, e.g., TigerVNC. You can then access the GUI in TigerVNC Viewer with the address given in the startup message and the password you’ve chosen. It opens with an xterminal of your container. To test, start ROOT by typing root in the container terminal prompt. In the ROOT prompt, type TBrowser t to open the ROOT graphical window. If the graphical window opens you are all set and you can exit from ROOT either by choosing the “Quit Root” option from the Browser menu of the TBrowser window or by typing .q in the ROOT prompt.
You can copy from the VNC Viewer terminal by selecting with the mouse, and paste to it by a middle mouse button click. If you are using a touchpad, you may need to define “middle mouse button” in Settings -> Devices -> Touchpad. You can set it to a three-finger tap in “Taps” menu under “Three finger gestures”, or to another selection of your choice.
Importantly, take note of the command to kill the vncserver in the startup message, and before exiting the container type it in the container prompt. If you don’t do it, you will not be able to open the graphics window next time you use the same container. Then exit the container.
~/CMSSW_5_3_32/src $ vncserver -kill :1
~/CMSSW_5_3_32/src $ exit
Windows WSL
Start the image download and open the container with
docker run -it --name my_od -P -p 5901:5901 cmsopendata/cmssw_5_3_32_vnc:latest /bin/bash
Setting up CMSSW_5_3_32
CMSSW should now be available.
~/CMSSW_5_3_32/src $
This is now a bash shell in the CMS open data environment in which you have access to a complete CMS software release that is appropriate for interfacing with the 2011 and 2012 7 and 8 TeV datasets.
As there are rate limits for pulls from Docker Hub, you may get the following error message:
docker: Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading.
In that case, try later (the limit is per 6 hours) or use the mirror gitlab-registry.cern.ch/cms-cloud/cmssw-docker-opendata/cmssw_5_3_32_vnc instead of cmsopendata/cmssw_5_3_32_vnc.
If the docker command exits without giving you the output above, see this post in the CERN Open Data forum. Note in particular that the .wslconfig file that you need to add must not have a file extension; if Windows adds one automatically, rename the file.
Now let’s understand the options that were used for the docker run command.
- First, the -it option (short for -i -t) means to start the container in interactive mode with a terminal attached. Essentially, it means that you will end up inside the running container.
- We assign a name to the container using the --name switch, so that we can refer back to this environment and still access any files we created in there. You can, of course, choose a different name than my_od.
- The options -P -p 5901:5901 open/publish a port from the container to the local host, needed for the graphical windows.
- cmsopendata/cmssw_5_3_32_vnc:latest is the name (and :version) of the image we will use. If no registry is prepended, Docker assumes that it resides in Docker Hub, the official image repository of Docker.
- Finally, the /bin/bash option will throw the container into a bash shell when running interactively.
For a more complete listing of options, see the official Docker documentation on the docker run command.
Now, first make sure that you can copy instructions from a browser page to the container terminal. It works in the same manner as the local WSL Linux terminal, i.e. you can usually copy from other sources with Ctrl+C and then paste into your container terminal with a right mouse click. Copy from the terminal itself by selecting the text to be copied. If this does not work, you will see later in these instructions how to pass files from your local computer to the container.
This container has a VNC application installed to allow opening graphical windows on a remote machine (seen from the container, your own computer is a remote machine). Start the application with start_vnc from your container prompt, and choose a password. You will need to start it every time you use the container (if you want to open graphics windows), but you will define the password only the first time.
~/CMSSW_5_3_32/src $ start_vnc
You will require a password to access your desktops.
Password:
Verify:
xauth: file /home/cmsusr/.Xauthority does not exist
New 'myvnc:1' desktop is e0ca768960bf:1
Starting applications specified in /home/cmsusr/.vnc/xstartup
Log file is /home/cmsusr/.vnc/e0ca768960bf:1.log
VNC connection points:
VNC viewer address: 127.0.0.1:5901
OSX built-in VNC viewer command: open vnc://127.0.0.1:5901
To kill the vncserver enter 'vncserver -kill :1'
When you do this the first time, download a VNC viewer to your local machine from TigerVNC. You can then access the GUI in TigerVNC Viewer with the address given in the startup message and the password you’ve chosen. It opens with an xterminal of your container. If it does not open, it may be that the Windows firewall is blocking it. In that case, check these instructions. To test, start ROOT by typing root in the container terminal prompt. In the ROOT prompt, type TBrowser t to open the ROOT graphical window. If the graphical window opens you are all set and you can exit from ROOT either by choosing the “Quit Root” option from the Browser menu of the TBrowser window or by typing .q in the ROOT prompt.
You can copy from the VNC Viewer terminal by selecting with the mouse, and paste to it by a middle mouse button click. If you are using a touchpad, you may need to define “middle mouse button” in Settings -> Devices -> Touchpad. You can set it to a three-finger tap in “Taps” menu under “Three finger gestures”, or to another selection of your choice.
Importantly, take note of the command to kill the vncserver in the startup message, and before exiting the container type it in the container prompt. If you don’t do it, you will not be able to open the graphics window next time you use the same container. Then exit the container.
~/CMSSW_5_3_32/src $ vncserver -kill :1
~/CMSSW_5_3_32/src $ exit
MacOS
Start the image download and open the container with
docker run -it --name my_od -P -p 5901:5901 cmsopendata/cmssw_5_3_32_vnc:latest /bin/bash
Setting up CMSSW_5_3_32
CMSSW should now be available.
~/CMSSW_5_3_32/src $
This is now a bash shell in the CMS open data environment in which you have access to a complete CMS software release that is appropriate for interfacing with the 2011 and 2012 7 and 8 TeV datasets.
As there are rate limits for pulls from Docker Hub, you may get the following error message:
docker: Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading.
In that case, try later (the limit is per 6 hours) or use the mirror gitlab-registry.cern.ch/cms-cloud/cmssw-docker-opendata/cmssw_5_3_32_vnc instead of cmsopendata/cmssw_5_3_32_vnc.
Now let’s understand the options that were used for the docker run command.
- First, the -it option (short for -i -t) means to start the container in interactive mode with a terminal attached. Essentially, it means that you will end up inside the running container.
- We assign a name to the container using the --name switch, so that we can refer back to this environment and still access any files we created in there. You can, of course, choose a different name than my_od.
- The options -P -p 5901:5901 open/publish a port from the container to the local host, needed for the graphical windows.
- cmsopendata/cmssw_5_3_32_vnc:latest is the name (and :version) of the image we will use. If no registry is prepended, Docker assumes that it resides in Docker Hub, the official image repository of Docker.
- Finally, the /bin/bash option will throw the container into a bash shell when running interactively.
For a more complete listing of options, see the official Docker documentation on the docker run command.
This container has a VNC application installed to allow opening graphical windows on a remote machine (seen from the container, your own computer is a remote machine). Start the application with start_vnc from your container prompt, and choose a password. You will need to start it every time you use the container (if you want to open graphics windows), but you will define the password only the first time.
~/CMSSW_5_3_32/src $ start_vnc
You will require a password to access your desktops.
Password:
Verify:
xauth: file /home/cmsusr/.Xauthority does not exist
New 'myvnc:1' desktop is e0ca768960bf:1
Starting applications specified in /home/cmsusr/.vnc/xstartup
Log file is /home/cmsusr/.vnc/e0ca768960bf:1.log
VNC connection points:
VNC viewer address: 127.0.0.1:5901
OSX built-in VNC viewer command: open vnc://127.0.0.1:5901
To kill the vncserver enter 'vncserver -kill :1'
You can access the GUI in the Mac VNC viewer. The first time you do this, enter your computer’s “Settings” menu and turn on “screen sharing” from the “Computer Settings” options, then click on “VNC Viewers” and enter the password you chose. Open the VNC viewer from “Finder” by choosing “connect to server” from the “Go” tab. Paste the address given for the OSX built-in VNC viewer in the container’s VNC startup message (vnc://127.0.0.1:5901 in the example above) and connect. It opens with an xterminal of your container. To test, start ROOT by typing root in the container terminal prompt. In the ROOT prompt, type TBrowser t to open the ROOT graphical window. If the graphical window opens you are all set and you can exit from ROOT either by choosing the “Quit Root” option from the Browser menu of the TBrowser window or by typing .q in the ROOT prompt.
Importantly, take note of the command to kill the vncserver in the startup message, and before exiting the container type it in the container prompt. If you don’t do it, you will not be able to open the graphics window next time you use the same container. Then exit the container.
~/CMSSW_5_3_32/src $ vncserver -kill :1
~/CMSSW_5_3_32/src $ exit
Coming back to the same container
You can come back to the same container you’ve used earlier with the docker start ... command. Note that running the docker run ... command as before would not bring you back: with a different name (or no name) it creates a new container from the image you’ve downloaded, and with the same name it fails because that name is already in use. A new container is a new environment, and any files that you’ve made or any code that you’ve written before will not be there! To go to the same working area with all your files and code saved each time, you will need to start the existing container.
There are two ways to do this: by giving your container instance a name or by making sure you reference the container id. The former approach is probably easier and preferred, but we discuss both below.
Start container by name
The easiest way to start a container that you want to return to is using the name as defined with the --name option in the docker run ... command before.
Use -i for opening the container in interactive mode. (Unlike docker run, docker start has no -t option; the terminal setting chosen when the container was created is reused.)
So, to restart your container:
docker start -i my_od
Start container by container ID or by name assigned by Docker
If you did not name your container, you will need to find the container ID or the container name assigned automatically by docker to return to the exact same container as before. First of all, you want to see the list of containers you have locally. To do this, run the following command
docker ps -a
You’ll see a list of containers that may look something like the following (the exact output will vary from user to user).
CONTAINER ID   IMAGE                      COMMAND                  CREATED         STATUS                      PORTS   NAMES
4f323c317b90   hello-world                "/hello"                 3 minutes ago   Exited (0) 3 minutes ago            modest_jang
7719a7d74190   cmsopendata/cmssw_5_3_32   "/opt/cms/entrypoint…"   9 minutes ago   Exited (0) 2 minutes ago            happy_greider
8939ade0bfac   cmsopendata/cmssw_5_3_32   "/opt/cms/entrypoint…"   16 hours ago    Exited (128) 16 hours ago           hungry_bhaskara
e914cef3c45a   cmsopendata/cmssw_5_3_32   "/opt/cms/entrypoint…"   6 days ago      Exited (1) 9 minutes ago            beautiful_tereshkova
b3a888c059f7   cmsopendata/cmssw_5_3_32   "/opt/cms/entrypoint…"   13 days ago     Exited (0) 13 days ago              affectionate_ardinghelli
You can restart a container with the CONTAINER ID or with the NAME. In the above example, I know that I’ve been using the most recent CMS open data container, with 7719a7d74190 as CONTAINER ID and happy_greider as NAME. To restart it, I can run one of the following commands. Note that you would want to change the CONTAINER ID or NAME for your particular case.
docker start -i 7719a7d74190
or
docker start -i happy_greider
Voila! You should be back in the same container.
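If you are unsure whether a container with your chosen name already exists, you can check the list of names before deciding between docker start and docker run. The sketch below is one possible way to do it: name_in_list is a hypothetical helper name, and the script only prints a suggestion rather than starting anything.

```shell
# name_in_list NAME: succeed if NAME appears as a complete line on standard input.
# (name_in_list is a hypothetical helper name used only in this sketch.)
name_in_list() {
  grep -qxF -- "$1"
}

CONTAINER="my_od"   # the name used in this lesson; change it to yours

if command -v docker >/dev/null 2>&1; then
  # --format '{{.Names}}' prints only the NAMES column, one name per line.
  if docker ps -a --format '{{.Names}}' 2>/dev/null | name_in_list "$CONTAINER"; then
    echo "Container exists, restart it with: docker start -i $CONTAINER"
  else
    echo "No container named $CONTAINER, create it with the full docker run command"
  fi
fi
```

Matching the whole line matters here: a container called my_od_old should not count as my_od.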
CHALLENGE! Test persistence
Go into the container and create a test file using some simple shell commands. Type the following exactly as you see it. It will dump some text into a file and then print the contents of the file to the screen
echo "I am still here" > test.tmp
cat test.tmp
After you’ve done this, exit out of the container and start it again. If you did it correctly, you should be able to list the contents of the directory with ls -l and see your file from before! If not, check that you followed all the instructions above correctly or contact the facilitators.
Copy file(s) into or out of a container
Sometimes you will want to copy a file directly into or out of a container. Let’s start with copying a file out.
Suppose you have created your my_od container and you did the challenge question above to test persistence. In your container, there should now be a file called test.tmp. Run the following on your local machine and not in the container. It should copy the file out and onto your local machine where you can inspect it.
docker cp my_od:/home/cmsusr/CMSSW_5_3_32/src/test.tmp .
If you want to copy a file into a container instance, it works the way you might expect. Suppose you have a local file called localfile.tmp. You can copy it into the same instance as follows.
docker cp localfile.tmp my_od:/home/cmsusr/CMSSW_5_3_32/src/
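Both docker cp commands use the same CONTAINER:PATH form for the container side, so it can be convenient to build that part once. This is a small sketch under that assumption; in_container is a hypothetical helper name, and the commands are printed rather than executed.

```shell
CONTAINER="my_od"
WORKDIR="/home/cmsusr/CMSSW_5_3_32/src"

# in_container FILE: print the CONTAINER:PATH form that docker cp expects.
# An empty FILE gives the working directory itself.
# (in_container is a hypothetical helper name used only in this sketch.)
in_container() {
  printf '%s:%s/%s\n' "$CONTAINER" "$WORKDIR" "$1"
}

# Printed only; drop the leading "echo" to run them on your local machine.
echo docker cp "$(in_container test.tmp)" .
echo docker cp localfile.tmp "$(in_container '')"
```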
Stopping and removing containers
As you are learning how to use Docker, you may find yourself with multiple containers. Or maybe you started a container with your favourite name with some set of flags and now you want to use that same name but with new flags. In that case, you will want to stop the container and remove it.
A container stops when you type the exit command in the container prompt. It may happen that you accidentally close the terminal where the container is running. In that case, the container will not stop and it will remain running. You can list the running containers with docker ps. You can either return to the container using its name (here “my_od”) with the attach command on your local machine and then exit normally from the container prompt:
docker attach my_od
exit
or stop the container with
docker stop my_od
To stop all running containers:
docker stop $(docker ps -q)
To remove the container “my_od”, you would type the following. Note that this will delete the container and all files that you may have created in it.
docker rm my_od
To remove all containers:
docker rm $(docker ps -aq)
Don’t worry!
Note that these commands will not remove the actual Docker image that you downloaded and may have taken quite some time to download! Whew!
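One detail about the remove-all command: docker rm needs at least one argument, so docker rm $(docker ps -aq) reports an error when there are no containers left. A guarded variant could look like the sketch below; remove_all_containers is a hypothetical function name, and the function is only defined here, not called.

```shell
# remove_all_containers: like "docker rm $(docker ps -aq)", but does nothing
# when the list of containers is empty.
# (remove_all_containers is a hypothetical name used only in this sketch.)
remove_all_containers() {
  ids=$(docker ps -aq)
  if [ -z "$ids" ]; then
    echo "No containers to remove."
    return 0
  fi
  # Unquoted on purpose: each container ID must become a separate argument.
  docker rm $ids
}

# Call it explicitly when you really want to delete every container:
# remove_all_containers
```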
Mounting a local volume
Sometimes you may want to mount a filesystem from your local machine or some other remote system so that your docker container can see it. Let’s first see how this is done in a general way.
The basic usage is
docker run -v <path on host>:<path in container> <image>
Here, the path on host is the full path to the local file system/directory you want to make available in the container. The path in container is where it will be mounted in your Docker container.
There are more options and if you want to read more, please visit the official Docker documentation.
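Because the host side has to be a full path (a bare name in that position would be taken as a named volume rather than a directory), a small check before building the option can save some confusion. The following is a sketch only; volume_flag is a hypothetical helper name, and the example paths are the ones used earlier in this lesson.

```shell
# volume_flag HOST_PATH CONTAINER_PATH: print the -v option for docker run,
# refusing host paths that are not absolute.
# (volume_flag is a hypothetical helper name used only in this sketch.)
volume_flag() {
  case "$1" in
    /*) printf '%s %s:%s\n' "-v" "$1" "$2" ;;
    *)  echo "host path must be absolute: $1" >&2; return 1 ;;
  esac
}

volume_flag /home/joe/playground /playground
```

The printed option can then be added to the docker run command line in front of the image name.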
When working with the CMS open data, you will find yourself using this approach to have a local working directory for all your editing/version control, etc. Note that all your compiling and executing still has to be done in the Docker container! But having your source code also visible on your local laptop/desktop will make things easier for you.
Let’s try this. First, before you start up your container, create a local directory
where you will be doing your code development. In the example below, I’m calling it
cms_open_data_work
and it will live in my $HOME
directory. You may choose a shorter directory name if you like. :)
Local machine
cd # This is to make sure I'm in my home directory
mkdir cms_open_data_work
Then fire up your Docker container, adding the following option:
-v ${HOME}/cms_open_data_work:/home/cmsusr/cms_open_data_work
Follow the example below, depending on your operating system.
On Linux, your full docker run ... command would then look like this:
docker run -it --name my_od --net=host --env="DISPLAY" -v $HOME/.Xauthority:/home/cmsusr/.Xauthority:rw -v ${HOME}/cms_open_data_work:/home/cmsusr/cms_open_data_work cmsopendata/cmssw_5_3_32 /bin/bash
On Windows WSL, your full docker run ... command would then look like this:
docker run -it --name my_od -P -p 5901:5901 -v ${HOME}/cms_open_data_work:/home/cmsusr/cms_open_data_work cmsopendata/cmssw_5_3_32_vnc:latest /bin/bash
On MacOS, your full docker run ... command would then look like this:
docker run -it --name my_od -P -p 5901:5901 -v ${HOME}/cms_open_data_work:/home/cmsusr/cms_open_data_work cmsopendata/cmssw_5_3_32_vnc:latest /bin/bash
Setting up CMSSW_5_3_32
CMSSW should now be available.
~/CMSSW_5_3_32/src $
When your Docker container starts up, it puts you in /home/cmsusr/CMSSW_5_3_32/src, but your new mounted directory is /home/cmsusr/cms_open_data_work.
The easiest thing to do is to create a soft link to that directory from inside /home/cmsusr/CMSSW_5_3_32/src using ln -s ... as shown below, and then do your work in that directory.
Warning!
Sometimes the local volume is mounted in the Docker container as the wrong user/group. It should be cmsusr but sometimes it is mounted as cmsinst. Note that in the following set of commands, we add a line to change the user/group with the chown command. If this is an issue, you’ll also need to do this in the container for any new directories you check out on your local machine.
Docker container
cd /home/cmsusr/CMSSW_5_3_32/src
sudo chown -R cmsusr.cmsusr ~/cms_open_data_work/ # this is only needed if owner of cms_open_data_work is not cmsusr
ln -s ~/cms_open_data_work/
cd /home/cmsusr/CMSSW_5_3_32/src/cms_open_data_work/
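To find out whether the chown step is needed at all, you can look at the current owner of the mounted directory first. The sketch below, meant to be run inside the container, is one way to do that; owner_of and needs_chown are hypothetical helper names, and the owner is read from the third column of the ls -ld output.

```shell
# owner_of PATH: print the owning user as shown by ls.
owner_of() {
  ls -ld "$1" | awk '{print $3}'
}

# needs_chown PATH USER: succeed if PATH is owned by someone other than USER.
# (owner_of and needs_chown are hypothetical names used only in this sketch.)
needs_chown() {
  [ "$(owner_of "$1")" != "$2" ]
}

dir="$HOME/cms_open_data_work"
if [ -d "$dir" ] && needs_chown "$dir" cmsusr; then
  echo "Owner of $dir is $(owner_of "$dir"); run: sudo chown -R cmsusr.cmsusr $dir"
fi
```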
Now, open a new terminal on your local machine (or simply exit out of your container) and use it to check out the git repositories and edit the files you’ll be working with. We will see that in the next section.
Key Points
You have now set up a Docker container as a working environment for CMS open data. You know how to open a graphical window in it and how to pass files between your own computer and the container.