Run your analysis
Overview
Teaching: 0 min
Exercises: 40 min
Questions
Can you run POET through apptainer in a condor job?
Can you merge the output files?
Can you export the merged files to your laptop?
Objectives
Test running condor jobs for the CMSSW Open Data container
Learn the hadd command for merging ROOT files
Copy files out of TIFR onto your personal machine
Let’s submit a job!
If you have logged out of TIFR, log back in and go to your condor script area:
$ ssh userXX@ui3.indiacms.res.in
$ cd condorLite/
One example file list has been created for you. Explore its contents in a text editor or using cat:
$ cat filelists/DYJetsToLL_13TeV_MINIAODSIM.fls
You will see many ROOT file locations with the “eospublic” access URL:
root://eospublic.cern.ch//eos/opendata/cms/mc/RunIIFall15MiniAODv2/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/10000/004544CB-6DD8-E511-97E4-0026189438F6.root
root://eospublic.cern.ch//eos/opendata/cms/mc/RunIIFall15MiniAODv2/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/10000/0047FF1A-70D8-E511-B901-0026189438F4.root
root://eospublic.cern.ch//eos/opendata/cms/mc/RunIIFall15MiniAODv2/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/10000/00EB960E-6ED8-E511-9165-0026189438E2.root
root://eospublic.cern.ch//eos/opendata/cms/mc/RunIIFall15MiniAODv2/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/10000/025286B9-6FD8-E511-BDA0-0CC47A78A418.root
root://eospublic.cern.ch//eos/opendata/cms/mc/RunIIFall15MiniAODv2/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/10000/02967670-70D8-E511-AAFC-0CC47A78A478.root
root://eospublic.cern.ch//eos/opendata/cms/mc/RunIIFall15MiniAODv2/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/10000/02AB0BD2-6ED8-E511-835B-00261894393A.root
root://eospublic.cern.ch//eos/opendata/cms/mc/RunIIFall15MiniAODv2/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/10000/02FE246D-71D8-E511-971A-0CC47A4D761A.root
root://eospublic.cern.ch//eos/opendata/cms/mc/RunIIFall15MiniAODv2/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/10000/0626FEB3-70D8-E511-A5B1-0CC47A4D765A.root
...and more...
Submit condor jobs that will each process the first 5000 events from a list of 2 MiniAOD files. Since this is a small-scale test, cap the number of jobs at 4 (this will not process the entire Drell-Yan dataset). You have two options for submitting these jobs. First, you could reference the file list:
$ python3 scripts/makeCondorJobs.py -f filelists/DYJetsToLL_13TeV_MINIAODSIM.fls --tag DYJetsToLL_v1 -n 2 -j 4 -e 5000 --run_template templates/runScript.tpl.sh
Alternately, you can reference the recid of this Drell-Yan dataset:
$ python3 scripts/makeCondorJobs.py --recid 16446 --tag DYJetsToLL_v1 -n 2 -j 4 -e 5000 --run_template templates/runScript.tpl.sh
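The splitting behind the -n and -j options can be sketched in plain Python. This is only an illustration of how a file list might be chunked into jobs of `-n` files each, capped at `-j` jobs; the function name and logic are hypothetical, not taken from scripts/makeCondorJobs.py:

```python
# Illustrative sketch of splitting a file list into condor jobs.
# Mirrors the effect of "-n 2 -j 4": 2 files per job, at most 4 jobs.

def make_job_file_lists(files, files_per_job=2, max_jobs=4):
    """Group `files` into chunks of `files_per_job`, keeping at most `max_jobs` chunks."""
    chunks = [files[i:i + files_per_job]
              for i in range(0, len(files), files_per_job)]
    return chunks[:max_jobs]

# A toy stand-in for the Drell-Yan file list:
files = [f"root://eospublic.cern.ch//eos/opendata/cms/.../file_{i}.root"
         for i in range(10)]

jobs = make_job_file_lists(files, files_per_job=2, max_jobs=4)
print(len(jobs))      # 4 jobs, as with -n 2 -j 4
print(len(jobs[0]))   # 2 files per job
```

With ten available files and these settings, only eight files are processed, which is why the small-scale test does not cover the entire dataset.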
To test your setup, try executing one of the generated jobs directly (we recommend doing this at least once before submitting the jobs):
$ ./Condor/odw_poet/poetV1_DYJetsToLL_v1/Job_1/poetV1_DYJetsToLL_v1_1_run.sh
To submit the jobs to the condor cluster, you can either submit manually with the condor_submit command, or add -s when calling the scripts/makeCondorJobs.py script, as below:
$ python3 scripts/makeCondorJobs.py --recid 16446 --tag DYJetsToLL_v1 -n 2 -j 4 -e 5000 --run_template templates/runScript.tpl.sh -s
You will first be asked to confirm that you really want to submit jobs, because we have included the -s argument in this command. Type y to confirm.
If you wish to inspect the submission scripts first, leave off -s and use the condor_submit command printed in the output to submit the jobs later.
You will see something like the following output when you submit jobs, with slight differences between the filelist (-f) and --recid submission options:
[userXX@ui3 condorLite]$ python3 scripts/makeCondorJobs.py -f filelists/DYJetsToLL_13TeV_MINIAODSIM.fls --tag DYJetsToLL_v1 -n 2 -j 4 -e 5000 --run_template templates/runScript.tpl.sh -s
Do you really want to submit the jobs to condor pool ? y
Number of jobs to be made 4
Number of events to process per job 5000
Tag for the job DYJetsToLL_v1
Output files will be stored at /home/userXX/condorLite/results/odw_poet/poetV1_DYJetsToLL_v1/
File list to process : filelists/DYJetsToLL_13TeV_MINIAODSIM.fls
Making Jobs in templates/runScript.tpl.sh for files from filelists/DYJetsToLL_13TeV_MINIAODSIM.fls
4 Jobs made !
submit file : /home/userXX/condorLite/Condor/odw_poet/poetV1_DYJetsToLL_v1//jobpoetV1_DYJetsToLL_v1.sub
Condor Jobs can now be submitted by executing :
condor_submit /home/userXX/condorLite/Condor/odw_poet/poetV1_DYJetsToLL_v1//jobpoetV1_DYJetsToLL_v1.sub
Submitting job(s)....
4 job(s) submitted to cluster CLUSTERID. # your CLUSTERID will be a number
Monitoring condor jobs
HTCondor supports many commands that can provide information on the status of job queues and a user’s submitted jobs. Details are available in the HTCondor manual pages on managing a job. A few extremely useful commands are shared here.
To see the status of your jobs:
$ condor_q
-- Schedd: ui3.indiacms.res.in : <144.16.111.98:9618?... @ 01/03/24 10:06:52
OWNER BATCH_NAME SUBMITTED DONE RUN IDLE TOTAL JOB_IDS
userXX ID: CLUSTERID 1/3 10:03 _ 5 _ 5 CLUSTERID.JOBIDs
Total for query: 5 jobs; 0 completed, 0 removed, 0 idle, 5 running, 0 held, 0 suspended
Total for userXX: 5 jobs; 0 completed, 0 removed, 0 idle, 5 running, 0 held, 0 suspended
Total for all users: 5 jobs; 0 completed, 0 removed, 0 idle, 5 running, 0 held, 0 suspended
This command shows the cluster identification number for each set of jobs, the submission time, how many jobs are running or idle, and the individual id numbers of the jobs.
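If you want to monitor jobs from a script rather than by eye, the summary lines above can be parsed. Below is an illustrative Python sketch; the regular expression and field names are our own choices for this lesson, not part of HTCondor itself:

```python
import re

# Parse a condor_q summary line such as:
#   "Total for userXX: 5 jobs; 0 completed, 0 removed, 0 idle, 5 running, 0 held, 0 suspended"
def parse_condor_q_total(line):
    pattern = (r"Total for \S+: (\d+) jobs; (\d+) completed, (\d+) removed, "
               r"(\d+) idle, (\d+) running, (\d+) held, (\d+) suspended")
    m = re.search(pattern, line)
    if m is None:
        return None
    keys = ["jobs", "completed", "removed", "idle", "running", "held", "suspended"]
    return dict(zip(keys, map(int, m.groups())))

line = ("Total for userXX: 5 jobs; 0 completed, 0 removed, 0 idle, "
        "5 running, 0 held, 0 suspended")
status = parse_condor_q_total(line)
print(status)
```

A loop over such parsed results could, for example, wait until the running count drops to zero before starting the merge step.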
To kill jobs you no longer want, remove their cluster:
$ condor_rm CLUSTERID # use CLUSTERID.JOBID to kill a single job in the cluster
All jobs in cluster CLUSTERID have been marked for removal
Job output
These short test jobs will likely only take a few minutes to complete. The submission command output points out the directories that will contain the
condor job information and the eventual output files from the job. You can study the condor job’s executable script, resource use log,
error file, and output file in the newly created Condor
folder:
[userXX@ui3 condorLite]$ ls Condor/odw_poet/poetV1_DYJetsToLL_v1/ # each unique "tag" you provide when submitting jobs will get a unique folder
Job_1 Job_2 Job_3 Job_4 jobpoetV1_DYJetsToLL_v1.sub
[userXX@ui3 condorLite]$ ls Condor/odw_poet/poetV1_DYJetsToLL_v1/Job_1
poetV1_DYJetsToLL_v1_1_run.sh run.1195980.log run.1195980.stderr run.1195980.stdout
The output files can be found in the results folder:
[userXX@ui3 condorLite]$ ls -lh results/odw_poet/poetV1_DYJetsToLL_v1/
total 32M
-rw-r--r-- 1 user1 user1 7.8M Jan 3 21:47 outfile_1_DYJetsToLL_v1_numEvent5000.root
-rw-r--r-- 1 user1 user1 7.8M Jan 3 21:46 outfile_2_DYJetsToLL_v1_numEvent5000.root
-rw-r--r-- 1 user1 user1 7.9M Jan 3 21:46 outfile_3_DYJetsToLL_v1_numEvent5000.root
-rw-r--r-- 1 user1 user1 7.8M Jan 3 21:47 outfile_4_DYJetsToLL_v1_numEvent5000.root
We can take advantage of the fact that the TIFR cluster also has ROOT installed by default to inspect one of these output files:
[userXX@ui3 condorLite]$ root results/odw_poet/poetV1_DYJetsToLL_v1/outfile_1_DYJetsToLL_v1_numEvent5000.root
------------------------------------------------------------------
| Welcome to ROOT 6.24/08 https://root.cern |
| (c) 1995-2021, The ROOT Team; conception: R. Brun, F. Rademakers |
| Built for linuxx8664gcc on Sep 29 2022, 13:04:57 |
| From tags/v6-24-08@v6-24-08 |
| With c++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44) |
| Try '.help', '.demo', '.license', '.credits', '.quit'/'.q' |
------------------------------------------------------------------
root [0]
Attaching file results/odw_poet/poetV1_DYJetsToLL_v1/outfile_1_DYJetsToLL_v1_numEvent5000.root as _file0...
(TFile *) 0x2b35240
root [1] _file0->ls() // list the contents of the TFile object
TFile** results/odw_poet/poetV1_DYJetsToLL_v1/outfile_1_DYJetsToLL_v1_numEvent5000.root
TFile* results/odw_poet/poetV1_DYJetsToLL_v1/outfile_1_DYJetsToLL_v1_numEvent5000.root
KEY: TDirectoryFile myelectrons;1 myelectrons
KEY: TDirectoryFile mymuons;1 mymuons
KEY: TDirectoryFile mytaus;1 mytaus
KEY: TDirectoryFile myphotons;1 myphotons
KEY: TDirectoryFile mypvertex;1 mypvertex
KEY: TDirectoryFile mygenparticle;1 mygenparticle
KEY: TDirectoryFile myjets;1 myjets
KEY: TDirectoryFile myfatjets;1 myfatjets
KEY: TDirectoryFile mymets;1 mymets
Each of these TDirectoryFile objects is a directory that contains a tree called Events.
For any of the folders, we can access the tree on the command line for quick tests by creating a TTree object:
root [4] TTree *tree = (TTree*)_file0->Get("myelectrons/Events");
root [5] tree->GetEntries() // Confirm the number of events in the tree
(long long) 5000
root [6] tree->Print() // Print out the list of branches available
******************************************************************************
*Tree :Events : Events *
*Entries : 5000 : Total = 2096486 bytes File Size = 893419 *
* : : Tree compression factor = 2.34 *
******************************************************************************
*Br 0 :numberelectron : Int_t number of electrons *
*Entries : 5000 : Total Size= 20607 bytes File Size = 3509 *
*Baskets : 1 : Basket Size= 32000 bytes Compression= 5.72 *
*............................................................................*
*Br 1 :electron_e : vector<float> electron energy *
*Entries : 5000 : Total Size= 100130 bytes File Size = 49609 *
*Baskets : 4 : Basket Size= 32000 bytes Compression= 2.01 *
*............................................................................*
*Br 2 :electron_pt : vector<float> electron transverse momentum *
*Entries : 5000 : Total Size= 100150 bytes File Size = 49382 *
*Baskets : 4 : Basket Size= 32000 bytes Compression= 2.02 *
*............................................................................*
... and more ...
Merge output files
Condor jobs can produce many output files, and we often want to combine multiple output ROOT files into a single file.
A typical use case is merging the POET output files from a specific dataset into a single file.
We can use ROOT’s hadd tool to achieve this. Since TIFR has ROOT installed, you can try the hadd command directly in your results area:
$ hadd results/DYJetsToLL_v1.root results/odw_poet/poetV1_DYJetsToLL_v1/*.root
hadd Target file: DYJetsToLL_v1.root
hadd compression setting for all output: 1
hadd Source file 1: results/odw_poet/poetV1_DYJetsToLL_v1/outfile_1_DYJetsToLL_v1_numEvent5000.root
hadd Source file 2: results/odw_poet/poetV1_DYJetsToLL_v1/outfile_2_DYJetsToLL_v1_numEvent5000.root
hadd Source file 3: results/odw_poet/poetV1_DYJetsToLL_v1/outfile_3_DYJetsToLL_v1_numEvent5000.root
hadd Source file 4: results/odw_poet/poetV1_DYJetsToLL_v1/outfile_4_DYJetsToLL_v1_numEvent5000.root
hadd Target path: DYJetsToLL_v1.root:/
hadd Target path: DYJetsToLL_v1.root:/myelectrons
hadd Target path: DYJetsToLL_v1.root:/mymuons
hadd Target path: DYJetsToLL_v1.root:/mytaus
hadd Target path: DYJetsToLL_v1.root:/myphotons
hadd Target path: DYJetsToLL_v1.root:/mypvertex
hadd Target path: DYJetsToLL_v1.root:/mygenparticle
hadd Target path: DYJetsToLL_v1.root:/myjets
hadd Target path: DYJetsToLL_v1.root:/myfatjets
hadd Target path: DYJetsToLL_v1.root:/mymets
This command will produce a ROOT file, DYJetsToLL_v1.root, merging the trees from all the files matching results/odw_poet/poetV1_DYJetsToLL_v1/*.root
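Conceptually, hadd combines files with identical internal structure by concatenating the entries of matching trees. The toy Python sketch below, with dictionaries standing in for ROOT files (this is not how hadd is actually implemented), illustrates the expected result for our four 5000-event outputs:

```python
# Toy model of hadd: concatenate entries of matching trees across files.
# Each dictionary stands in for one ROOT output file.

def toy_hadd(files):
    merged = {}
    for f in files:
        for tree_path, entries in f.items():
            merged.setdefault(tree_path, []).extend(entries)
    return merged

# Four condor-job outputs with 5000 events each, as above:
files = [{"myelectrons/Events": list(range(5000))} for _ in range(4)]
merged = toy_hadd(files)
print(len(merged["myelectrons/Events"]))  # 20000 = 4 jobs x 5000 events
```

The key requirement, as with real hadd, is that every input shares the same structure; the merged file then contains the summed number of entries.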
What if my cluster doesn’t have ROOT?
If your cluster does not have ROOT installed (and with it the hadd command), you can use the ROOT docker container via apptainer. To access the hadd command from the ROOT container, we launch a container instance interactively:
$ apptainer shell --bind results/:/results docker://gitlab-registry.cern.ch/cms-cloud/root-vnc:latest
Here we mount the results folder as the /results folder inside the container. Now we are able to execute any command available in the docker container:
Apptainer $ hadd /results/DYJetsToLL_v1.root /results/odw_poet/poetV1_DYJetsToLL_v1/*.root
My output files are large
POET output files can easily reach many MB, scaling with the number of events processed. If your files are very large, merging them is not recommended – it requires additional storage space in your account (until the unmerged files can be deleted) and can make transferring the files out very slow.
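A quick back-of-envelope estimate makes this concrete. Using the ~7.8 MB test files above and an assumed transfer rate (an example value, not a measured TIFR number):

```python
# Back-of-envelope storage and transfer costs of merging condor outputs.
# File size matches the test jobs above; the rate is an assumed example.

n_files = 4
size_per_file_mb = 7.8
rate_mb_per_s = 2.5           # assumed network transfer rate

unmerged_mb = n_files * size_per_file_mb
# While merging, the merged copy exists alongside the unmerged files:
peak_storage_mb = 2 * unmerged_mb
transfer_s = unmerged_mb / rate_mb_per_s

print(f"unmerged total: {unmerged_mb:.1f} MB")
print(f"peak storage during merge: {peak_storage_mb:.1f} MB")
print(f"transfer at {rate_mb_per_s} MB/s: {transfer_s:.1f} s")
```

For a full dataset with hundreds of jobs, the same arithmetic scales up quickly, which is why merging very large outputs in place is discouraged.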
Copy output files out of TIFR
As we’ve shown in earlier lessons, analysis of the POET ROOT files can be done with ROOT or Python tools, typically on your local machine.
To extract files from TIFR for local analysis, use the scp command:
$ scp -r userXX@ui3.indiacms.res.in:/home/userXX/condorLite/results/odw_poet/poetV1_DYJetsToLL_v1/ .
user1@ui3.indiacms.res.in's password:
outfile_2_DYJetsToLL_v1_numEvent5000.root 100% 7977KB 2.5MB/s 00:03
outfile_3_DYJetsToLL_v1_numEvent5000.root 100% 7997KB 5.3MB/s 00:01
outfile_1_DYJetsToLL_v1_numEvent5000.root 100% 7979KB 5.8MB/s 00:01
outfile_4_DYJetsToLL_v1_numEvent5000.root 100% 7968KB 6.0MB/s 00:01
You can also transfer individual files, such as a single merged file per dataset. Now you’re ready to dive in to your physics analysis!
Key Points
The hadd command allows you to merge ROOT files that have the same internal structure
Files can be extracted from TIFR to your local machine using scp
You can then analyze the POET ROOT files using other techniques from this workshop