CMS Open Data Workshop 2022


Aug 1-4, 2022

2:30 pm - 6:30 pm (CET)

Instructors: M. Bellis, E. Carrera, A. Geiser, J. Hogan, C. Lange, K. Lassila-Perini, T. McCauley, S. Sekmen, X. Tintin, J. Yoo

Helpers: A. Chicaiza, K. Chicaiza, N. Dhingra, E. Jimenez, K. Johnson, D. Liko, P. LLerena, S. Markham, D. Mena, D. Merizalde, J. Ochoa, E. Piedra,

General Information

Since 2014, the CMS Collaboration has pioneered the release of research-quality LHC data for public use, making a significant amount of these data accessible through the CERN Open Data portal. At the end of 2021, the CMS Collaboration released the first batch of its Run 2 data. This workshop is the third in a series that started in 2020. It aims to bridge the technical gap that usually exists between the scientific creativity of an external analyst and the nuts-and-bolts details of a full analysis with CMS open data. All exercises will be hands-on, and participants should be prepared to dive into the data right away. A set of pre-exercises and assignments is provided, and participants are required to complete them so that they can make the most of the workshop. Time will also be spent brainstorming with attendees about how the entire process of accessing and analyzing the data could be made more useful for the broader HEP community.

Please visit the official Indico site for the workshop.

Who: This workshop is primarily aimed at students and scientists with prior knowledge of collider physics and a strong interest in learning the craft of experimental analysis with CMS Open Data.

Where: CERN Laboratory, Geneva, CH.

When: Aug 1-4, 2022, Central European Time (CET).

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should also have a working CMS Open Data Docker container environment, as listed in the pre-exercises section.

Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please notify the instructors in advance of the workshop if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.

Contact: Please email for more information.

Code of Conduct

All workshop participants are expected to follow the CERN Code of Conduct.


All times are in CET (Central European Time)
Help is available through our Mattermost channel and/or the CMS Open Data Forum.


(Mandatory exercises must be completed before the start of the workshop)
(Submission of work assignments is required as explained in the Orientation below)
Mandatory  5 min              Orientation
Optional   (external lesson)  The Unix Shell
Optional   (external lesson)  Version Control with Git
Optional   (external lesson)  Programming with Python
Mandatory  2h                 Docker containers
Mandatory  2h                 Dataset scouting
Mandatory  4h                 ROOT with C++ and Python
Mandatory  2h                 Intro to CMSSW
Mandatory  2h                 Intro to cloud computing


14:30-14:50  Welcome and Intro                K. Lassila-Perini
14:50-15:50  Physics Objects: Intro and POET  M. Bellis, E. Carrera, K. Lassila-Perini
15:50-16:30  Physics Objects: Electrons       M. Bellis, E. Carrera, K. Lassila-Perini
17:00-17:40  Physics Objects: Muons           M. Bellis, E. Carrera, K. Lassila-Perini
17:40-18:30  Physics Objects: Jets            M. Bellis, E. Carrera, J. Hogan, K. Lassila-Perini


14:30-15:30  Trigger                           E. Carrera
15:30-16:30  Luminosity                        J. Yoo
17:00-18:30  Analysis example with Run 1 data  M. Bellis, A. Geiser


14:30-15:10  Simplified Run 2 analysis: Intro                  E. Carrera, T. McCauley
15:10-16:30  Simplified Run 2 analysis: Coffea analysis        E. Carrera, T. McCauley
17:00-18:30  Simplified Run 2 analysis: Systematics and Stats  E. Carrera, T. McCauley


14:30-16:30  Cloud Computing          C. Lange, K. Lassila-Perini, X. Tintin
17:00-18:30  Run 2 analysis with ADL  S. Sekmen
18:30-18:45  Wrap-up and feedback