Training DQM Tool for CSC

git clone https://github.com/Ma128-bit/ML4DQM.git

Get Monitoring elements (ME)

Requires a working conda installation

conda create --name MEProd python=3.9
conda activate MEProd
pip3 install -r requirementsMEProd.txt 
chmod +x submit.sh

Use Submit_getMEwithDIALS.py to submit, via HTCondor, the dials_api-based jobs that retrieve the MEs. It accepts the following arguments:

| Argument | Default | Required | Description |
|---|---|---|---|
| `-w / --workspace` | csc | False | DIALS workspace |
| `-m / --menames` | | True | One or a list of monitoring elements |
| `-d / --dtnames` | | False | One or a list of dataset names (if omitted, all available datasets are used) |
| `-t / --metype` | | True | Type of monitoring elements: h1d or h2d |
| `-o / --outputdir` | test | False | Output directory |
| `-c / --conda` | MEProd | False | Conda environment name |
| `-p / --miniconda_path` | | True | Path to the miniconda installation directory |
| `--min_run` | | True | Minimum run (not required if `--era` is given) |
| `--max_run` | | True | Maximum run (not required if `--era` is given) |
| `--max_splits` | 16 | False | Number of splits per ME |
| `--era` | | False | Automatically select the minimum and maximum run according to the chosen era (e.g. Run2024D) |
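
The rule that `--min_run` and `--max_run` are required unless `--era` is given can be expressed with argparse roughly as follows. This is a minimal sketch, not the actual code of Submit_getMEwithDIALS.py: the helper name `parse_run_selection` is hypothetical, and the lookup from an era to its run range is deliberately left out.

```python
import argparse


def parse_run_selection(argv):
    """Parse only the run-selection options (hypothetical helper, not from the repository)."""
    parser = argparse.ArgumentParser(description="Run-range selection sketch")
    parser.add_argument("--min_run", type=int, default=None, help="Minimum run")
    parser.add_argument("--max_run", type=int, default=None, help="Maximum run")
    parser.add_argument("--era", type=str, default=None, help="Era, e.g. Run2024D")
    args = parser.parse_args(argv)

    # Without --era, both run bounds must be given explicitly.
    if args.era is None and (args.min_run is None or args.max_run is None):
        parser.error("--min_run and --max_run are required unless --era is given")
    # An explicit range must be ordered.
    if args.era is None and args.min_run > args.max_run:
        parser.error("--min_run must not exceed --max_run")
    return args
```

Calling `parse_run_selection(["--era", "Run2024D"])` succeeds without run bounds, while `parse_run_selection(["--min_run", "1"])` exits with an argparse error.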

Usage example:

python3 Submit_getMEwithDIALS.py -m CSC/CSCOfflineMonitor/recHits/hRHGlobalm4 -t h2d -p /lustrehome/mbuonsante/miniconda3 \
-c MEProd --era Run2024E --max_splits 20 --outputdir hRHGlobalm4E

To check that all the jobs have finished, count the "Done:" lines in the job logs and compare the result with the number of submitted jobs:

grep "Done:" "outputdir"/log/*.out | wc -l
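
The same check can be done from Python, for example inside a notebook. The sketch below mirrors the grep command: it counts the lines containing "Done:" in every `.out` file under `<outputdir>/log`. The function name `count_done_lines` is an illustrative choice, and the sketch assumes, as the command above does, that each finished job writes one such line.

```python
from pathlib import Path


def count_done_lines(outputdir):
    """Count lines containing 'Done:' in <outputdir>/log/*.out (same as grep ... | wc -l)."""
    total = 0
    for log_file in sorted(Path(outputdir, "log").glob("*.out")):
        # errors="replace" keeps the scan going if a log contains odd bytes
        with open(log_file, errors="replace") as handle:
            total += sum(1 for line in handle if "Done:" in line)
    return total
```

For the usage example above, `count_done_lines("hRHGlobalm4E")` would return the number of completion lines found so far.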

Note:

If you get the error:

ImportError: cannot import name 'MutableMapping' from 'collections'

Edit classad/_expression.py, replacing "from collections import MutableMapping" with "from collections.abc import MutableMapping".
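
For background: the abstract base classes were moved to `collections.abc`, and the old aliases in `collections` were removed in Python 3.10, which is why the import fails on newer interpreters. If the patched file has to keep working on very old interpreters as well, a guarded import is a common pattern. This is a generic sketch, not the content of classad/_expression.py:

```python
try:
    # Location of the abstract base classes since Python 3.3
    from collections.abc import MutableMapping
except ImportError:
    # Fallback for interpreters that predate collections.abc
    from collections import MutableMapping

# dict is registered as a MutableMapping, so this confirms the import resolved correctly
print(issubclass(dict, MutableMapping))
```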

Main Workflow

The workflow is split into five steps, S0 to S4, listed below. Before running the code for the first time, initialise the database:

python3
import Utilities.database as db
db.init_db()

S0: Fetch image info

For CONDA users:

conda create --name PrePro python=3.9
conda activate PrePro
pip3 install -r requirementsPrePro.txt

For SWAN notebook users:

There is no need to follow the steps above. You only need to install oms-api-client and runregistry_api_client (as below) and import them as:

import sys
sys.path.append('run registry site')
sys.path.append('./oms-api-client')

where run registry site is the install location reported by: pip show runregistry
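
Instead of reading the location from the pip output by hand, the directory can also be looked up from Python. The sketch below uses the standard `importlib.util.find_spec`; the helper name `package_site_dir` is hypothetical. It only works when the package is already visible to the current interpreter: if it returns None, the path reported by pip show still has to be appended manually.

```python
import importlib.util
import sys
from pathlib import Path


def package_site_dir(package_name):
    """Return the directory that contains the installed package, or None if it cannot be located."""
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return None
    if spec.submodule_search_locations:
        # Regular package: the search location is <site>/<package_name>
        package_dir = Path(list(spec.submodule_search_locations)[0])
        return str(package_dir.parent)
    if not spec.has_location:
        # Built-in or frozen module: there is no directory on disk to report
        return None
    # Single-file module: <site>/<package_name>.py
    return str(Path(spec.origin).parent)


site_dir = package_site_dir("runregistry")
if site_dir is not None and site_dir not in sys.path:
    sys.path.append(site_dir)
```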

Note: since SWAN uses an LCG release that you are not allowed to modify, you have to:

1 - Install the packages in user scope with pip install omsapi runregistry --user

2 - Make sure that your SWAN session has access to the packages installed in your /eos area by checking the "Use Python packages installed on CERNBox" box when starting your session

For all users:

Follow the "Authentication Prerequisites" instructions of runregistry_api_client, then the oms-api-client instructions (the same application can be used for both runregistry and OMS). Save the OMS application credentials in a file named config.yaml with this structure:

APIClient:
    client_ID: 'id_example'
    Client_Secret: 'secret_example'
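
A quick way to confirm that config.yaml has the expected structure is to load it and read the two keys back. This is a minimal sketch, assuming PyYAML is installed in the environment; the function names are hypothetical and the repository may read the file differently. Note that the key names are case-sensitive and are used here exactly as written above.

```python
def read_oms_credentials(config):
    """Extract (client id, client secret) from the parsed content of config.yaml."""
    try:
        section = config["APIClient"]
        return section["client_ID"], section["Client_Secret"]
    except (KeyError, TypeError) as exc:
        raise ValueError(
            "config.yaml must define APIClient.client_ID and APIClient.Client_Secret"
        ) from exc


def load_oms_credentials(path="config.yaml"):
    """Load config.yaml with PyYAML (assumed to be installed) and return the credentials."""
    import yaml  # imported here so that read_oms_credentials works without PyYAML

    with open(path) as handle:
        return read_oms_credentials(yaml.safe_load(handle))
```

With the example file above, `load_oms_credentials()` would return the pair `('id_example', 'secret_example')`.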

conda activate PrePro

Run: python3 S0_GetInfo.py. List of arguments:

| Argument | Default | Required | Description |
|---|---|---|---|
| `-p / --path` | | True | ME parquet file location |
| `-m / --mename` | hRHGlobalm2 | True | One monitoring element. |
| `--conda_env` | | True | Path to conda env (like ~/miniconda3/envs/oms) |

Example: python3 S0_GetInfo.py -m "hRHGlobalm2" -p "../ML4DQM/MEs/" --conda_env ~/miniconda3/envs/oms

The script returns a job_id, which must be passed to the following steps.

S1: Pre-Processing and sum of consecutive LSs

conda activate PrePro
python3 hltScale.py --job_id XXXXXXX
python3 S1_PreProcessing.py --job_id XXXXXXX

S2: Train Autoencoder

ON SWAN/LXPLUS:

Run the training from the SWAN terminal or, on lxplus, first source /cvmfs/sft.cern.ch/lcg/views/LCG_108_swan/x86_64-el9-gcc13-opt/setup.sh

For NON CERN users:

conda create --name pytorch python=3.9
conda activate pytorch
pip3 install -r requirementsTraining.txt

python3 S2_training.py --job_id XXXXXXX --ring ["in" or "out"]

S3: Generate fake anomalies

(same environment as S2)

python3 S3_GenerateAnomaly.py --job_id XXXXXXX --ring ["in" or "out"]

S4: Study model performance with the fake anomalies

(same environment as S2)

python3 S4_performance.py --job_id XXXXXXX --ring ["in" or "out"]

Optional argument: --anomalysample, the path to a custom set of anomalies.

Important: If you run the same step twice (for example, after making some changes), it will return a new job_id, and you will need to use this new job_id for the subsequent steps.
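
Because the job_id has to be carried through every step, it can help to generate the command lines for S1 to S4 from a single value. The sketch below only builds and prints the commands documented above; the helper name `workflow_commands` is hypothetical, and it does not run anything or switch conda environments (S1 uses PrePro, S2 to S4 use the training environment).

```python
import shlex


def workflow_commands(job_id, ring):
    """Build the S1-S4 command lines for one job_id (illustrative helper)."""
    if ring not in ("in", "out"):
        raise ValueError('ring must be "in" or "out"')
    job = str(job_id)
    return [
        ["python3", "hltScale.py", "--job_id", job],
        ["python3", "S1_PreProcessing.py", "--job_id", job],
        ["python3", "S2_training.py", "--job_id", job, "--ring", ring],
        ["python3", "S3_GenerateAnomaly.py", "--job_id", job, "--ring", ring],
        ["python3", "S4_performance.py", "--job_id", job, "--ring", ring],
    ]


# XXXXXXX stands for the job_id returned by the previous step
for command in workflow_commands("XXXXXXX", "in"):
    print(shlex.join(command))
```

If a step is rerun and returns a new job_id, regenerate the remaining commands with that new value.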
