Summary and Schedule
This lesson teaches you how to explore the CMS datasets available on the CERN Open Data portal. You will find the primary datasets into which collision data were directed when the data were taken, as well as the simulated Monte Carlo samples available for the run period you are interested in.
You will also learn how to do a first-order inspection of some of these data files, just to see what is stored in them.
Setup Instructions | Download files required for the lesson |
Duration: 00h 00m | 1. Introduction | What is the point of these exercises? How do I find the data I want to work with? |
Duration: 00h 10m | 2. Where are the datasets? | Where do I find datasets for data and Monte Carlo? |
Duration: 00h 20m | 3. What data and Monte Carlo are available? | What data and run periods are available? What data do the collision datasets contain? What Monte Carlo samples are available? |
Duration: 00h 35m | 4. How to access metadata on the command line? | What is cernopendata-client? How to use cernopendata-client container image? How to get the list of files in a dataset on the command line? |
Duration: 00h 50m | 5. What is in the datafiles? | How do I inspect these files to see what is in them? |
Duration: 01h 00m | Finish |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
The first episodes of this lesson can be worked through in a web browser. Episodes 4 and 5 use Docker containers to inspect and download open data files and their metadata, which requires a Unix terminal (Linux, macOS terminal, or Windows WSL2 Ubuntu terminal) with Docker installed.
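
Before starting episodes 4 and 5, it can help to confirm that the Docker command-line tool is reachable from your terminal. The following is a minimal sketch, not part of the lesson material: the function name `docker_available` is hypothetical, and the check only confirms that the `docker` command exists and responds to `--version`. It does not verify that the Docker daemon is running or that any container image can be pulled.

```python
import shutil
import subprocess


def docker_available(command="docker"):
    """Return True if `command` is on PATH and `command --version` exits cleanly.

    Hypothetical helper for illustration only. It checks the command-line
    tool, not whether the Docker daemon is actually running.
    """
    # shutil.which returns None when the executable is not found on PATH.
    if shutil.which(command) is None:
        return False
    try:
        result = subprocess.run(
            [command, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The executable could not be started or did not answer in time.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if docker_available():
        print("Docker command found: you can try episodes 4 and 5 here.")
    else:
        print("Docker command not found: install Docker before episodes 4 and 5.")
```

If the check reports that Docker is missing, the first three episodes can still be followed in a web browser.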