Experiment helpers
utilities.experiment_helpers.experiment_launcher
Experiment launcher script.
This script helps run multiple experiments by taking a list of commands and executing them with a process pool. This way, a large list of experiments can be left running unattended.
Display help message to run the code:
python experiment_launcher.py --help
Displays all the relevant arguments that can be used.
Authors:
Alberto Garcia Garcia (garciagarcia@ice.csic.es)
log_experiment(process_result)
Callback to log all the info returned from an experiment run.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| process_result | Tuple[Path, str] | Tuple containing the experiment command and the full console output of the process as a string. | required |
Source code in utilities/experiment_helpers/experiment_launcher.py
main(args)
Execute different experiments using a multiprocessing pool.
This function initializes a multiprocessing pool to run several scripts defined by the user. It queues the different jobs, and manages their execution in parallel.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args | Namespace | Command-line arguments parsed by argparse. Required parameters include: | required |
Source code in utilities/experiment_helpers/experiment_launcher.py
run_experiment(command)
Run experiment command.
This is the main routine for running a particular experiment. It runs the provided experiment command (a Python call to the experiment script with a set of CLI arguments) and captures all the output of the process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| command | str | Full command to execute the experiment. | required |
Returns:
| Type | Description |
|---|---|
| Tuple[Path, str] | The experiment command and the captured output of the process. |
Source code in utilities/experiment_helpers/experiment_launcher.py
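The capture step described for run_experiment can be sketched with the standard library. This is a minimal illustration and not the project's implementation: the name `run_command` is hypothetical, the command is returned as a plain string rather than a Path, and folding stderr into stdout is an assumption about what "all the output" means.

```python
import shlex
import subprocess
import sys
from typing import Tuple


def run_command(command: str) -> Tuple[str, str]:
    """Run a command line and return it together with its console output.

    Hypothetical helper: stderr is redirected into stdout so that a single
    string holds everything the process printed.
    """
    completed = subprocess.run(
        shlex.split(command),      # split the command string into argv
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: merge both streams
        text=True,
        check=False,               # a failing experiment should not raise here
    )
    return command, completed.stdout


if __name__ == "__main__":
    # Build a portable command that calls the current Python interpreter.
    demo = shlex.join([sys.executable, "-c", "print(6 * 7)"])
    _, output = run_command(demo)
    print(output)
```

Returning the command alongside the output lets a pool callback report which run produced which text without any shared state.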
setup_process_pool(event, lock)
Set up the process pool for multiprocessing with a global pause/resume event.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| event | Event | Reference to a master process event that will signal the child processes to pause or resume execution. | required |
| lock | Lock | Reference to a master process lock that will coordinate the launching of child processes with waiting times. | required |
Source code in utilities/experiment_helpers/experiment_launcher.py
utilities.experiment_helpers.parameter_set_generator
Parameter-set generator module.
This module contains the methods necessary to produce a sweep of the simulation parameters. It is used by the parameter_sweeper.py and run_simulation_set.py scripts.
If the --sampling_type argument is set to "grid", we require the following for each tunable parameter:
--argument low high count
This expands the parameter to a linspace between [low, high] with a "count" number of steps.
If the --sampling_type argument is set to "random", we require the following for each tunable parameter:
--argument low high
This expands the parameter to a list of values between [low, high] drawn from a uniform distribution. In this case, the number of values to be drawn for each parameter is specified by the argument --sampling_size.
Both expansion types are evaluated for each specified argument. Subsequently, a generator produces all possible parameter combinations if in "grid" mode or sets of random parameter values if in "random" mode.
Authors:
Michele Ronchi (ronchi@ice.csic.es)
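The two expansion modes can be illustrated with a short standard-library sketch. It is not the module's code: the function names are hypothetical, a hand-written `linspace` stands in for whatever array routine the module uses, and drawing one value per parameter for each random set is an assumption about how the sets are assembled.

```python
import itertools
import random
from typing import Dict, Iterator, List, Optional, Tuple


def linspace(low: float, high: float, count: int) -> List[float]:
    """Evenly spaced values over [low, high], both endpoints included."""
    if count == 1:
        return [low]
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def grid_sets(ranges: Dict[str, Tuple[float, float, int]]) -> Iterator[Dict[str, float]]:
    """Yield every combination of the per-parameter grids ("grid" mode)."""
    names = list(ranges)
    axes = [linspace(low, high, count) for low, high, count in ranges.values()]
    for combination in itertools.product(*axes):
        yield dict(zip(names, combination))


def random_sets(
    ranges: Dict[str, Tuple[float, float]],
    sampling_size: int,
    seed: Optional[int] = None,
) -> Iterator[Dict[str, float]]:
    """Yield sampling_size sets drawn uniformly in each range ("random" mode)."""
    rng = random.Random(seed)
    for _ in range(sampling_size):
        yield {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
```

In grid mode the number of sets is the product of the counts, so it grows quickly with the number of tunable parameters. In random mode it is fixed by the sampling size regardless of how many parameters are varied.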
check_expand_args(args_dict)
Check that the parsed input arguments are coherent and have the correct shape. In grid mode, expand each simulation parameter linearly over the specified range. In random mode, draw a random set of parameter values from uniform distributions over the specified ranges.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args_dict | dict | Dictionary of the arguments parsed from the CLI. | required |
Returns:
| Type | Description |
|---|---|
| Tuple[list, list] | A list containing the expanded ranges of the parameters and a list containing the names of the expanded parameters. |
Source code in utilities/experiment_helpers/parameter_set_generator.py
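The shape check can be pictured as follows. This is only a sketch of the idea that grid mode expects `low high count` and random mode expects `low high`. The function name, the error messages and the extra `low <= high` and positive-count checks are assumptions that the documentation does not confirm.

```python
from typing import Sequence

# Number of values expected after each tunable argument, per sampling type.
EXPECTED_LENGTH = {"grid": 3, "random": 2}


def validate_range(sampling_type: str, name: str, values: Sequence[float]) -> None:
    """Raise ValueError if the values do not fit the sampling type."""
    if sampling_type not in EXPECTED_LENGTH:
        raise ValueError(f"unknown sampling type: {sampling_type!r}")
    expected = EXPECTED_LENGTH[sampling_type]
    if len(values) != expected:
        raise ValueError(
            f"--{name} expects {expected} values in {sampling_type} mode, "
            f"got {len(values)}"
        )
    low, high = values[0], values[1]
    if low > high:  # assumption: reversed ranges are rejected
        raise ValueError(f"--{name}: low ({low}) is greater than high ({high})")
    if sampling_type == "grid":
        count = values[2]
        if count < 1 or int(count) != count:  # assumption: positive integer
            raise ValueError(f"--{name}: count must be a positive integer")
```

For example, `validate_range("grid", "mass", [0.0, 1.0, 5])` passes silently, while the same three values in random mode raise a `ValueError`.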
check_parameter_compatibility(args_dict)
Check if the parsed input arguments are coherent with the ones provided in the configuration file of the simulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args_dict | dict | Dictionary of the arguments parsed from the CLI. | required |
Source code in utilities/experiment_helpers/parameter_set_generator.py
expand_parameter(args_dict, parameter_name, range_values)
Expand a simulation parameter. In grid mode, expand the parameter linearly over the specified range. In random mode, draw a random set of parameter values from a uniform distribution over the specified range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args_dict | dict | Dictionary of the arguments parsed from the CLI. | required |
| parameter_name | str | Name of the parameter to expand. | required |
| range_values | list | Range of values over which to expand the parameter. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | An array containing the expanded range of the parameter. |
Source code in utilities/experiment_helpers/parameter_set_generator.py
utilities.experiment_helpers.parameter_sweeper
Parameter-sweeper script.
This script generates the files necessary to launch multiple simulations with different parameter values using HTCondor at the PIC. It uses methods from the module parameter_set_generator.py.
If the --sampling_type argument is set to "grid", we require the following for each tunable parameter:
--argument low high count
This expands the parameter to a linspace between [low, high] with a "count" number of steps.
If the --sampling_type argument is set to "random", we require the following for each tunable parameter:
--argument low high
This expands the parameter to a list of values between [low, high] drawn from a uniform distribution. In this case, the number of values to be drawn for each parameter is specified by the argument --sampling_size.
Both expansion types are evaluated for each specified argument. Subsequently, a generator produces all possible parameter combinations if in "grid" mode or sets of random parameter values if in "random" mode. Each set will be saved in a JSON "parameter_override" file that will be used as input to a simulation.
Display help message to run the code:
python parameter_sweeper.py --help
Displays all the relevant arguments that can be used.
Authors:
Michele Ronchi (ronchi@ice.csic.es)
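Writing one override file per parameter set can be sketched as below. The helper name, the `parameter_override_NNNN.json` file naming and the flat key layout are all hypothetical. The real script may nest the values differently or use other file names.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List


def write_overrides(parameter_sets: Iterable[Dict[str, float]], output_dir) -> List[Path]:
    """Write each parameter set to its own JSON file and return the paths."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, parameters in enumerate(parameter_sets):
        # Hypothetical naming scheme: one numbered file per simulation.
        path = output_dir / f"parameter_override_{index:04d}.json"
        path.write_text(json.dumps(parameters, indent=2))
        paths.append(path)
    return paths
```

Keeping each set in a separate file means every queued job only needs the path to its own override, which suits a batch system where jobs start independently.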
main(args)
Generate parameter sets for running simulations based on provided arguments.
This function takes command-line arguments, parses them, and generates parameter sets for running simulations. It supports two types of sampling: grid and random. The arguments to run the simulations are saved in a text file, and override JSON files are created for each simulation containing the corresponding generated parameter sets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args | Namespace | An argparse.Namespace object containing the following attributes: | required |
Source code in utilities/experiment_helpers/parameter_sweeper.py
utilities.experiment_helpers.run_simulation_set
Simulator helper script.
This script allows us to run the various simulator utilities in a multithreaded way.
If the --sampling_type argument is set to "grid", we require the following for each tunable parameter:
--argument low high count
This expands the parameter to a linspace between [low, high] with a "count" number of steps.
If the --sampling_type argument is set to "random", we require the following for each tunable parameter:
--argument low high
This expands the parameter to a list of values between [low, high] drawn from a uniform distribution. In this case, the number of values to be drawn for each parameter is specified by the argument --sampling_size.
Both expansion types are evaluated for each specified argument. Subsequently, a generator produces all possible parameter combinations if in "grid" mode or sets of random parameter values if in "random" mode. Each parameter combination will spawn a new process that enters a multithreaded pool for later execution, allowing the asynchronous simulation of many populations in parallel with a defined maximum number of threads.
NOTE: if an error occurs in one of the simulations, the script will not stop until all the processes have terminated. In this case, the error will only be shown in the terminal.
Display help message to run the code:
python run_simulation_set.py --help
Displays all the relevant arguments that can be used.
Authors:
Alberto Garcia Garcia (garciagarcia@ice.csic.es)
Michele Ronchi (ronchi@ice.csic.es)
Celsa Pardo Araujo (pardo@ice.csic.es)
log_error(e)
Log an exception raised during the simulation process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| e | Exception | The exception to log. | required |
Source code in utilities/experiment_helpers/run_simulation_set.py
log_simulation(process_result)
Callback to log all the info returned from a simulation run.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| process_result | Tuple[Path, str] | Tuple containing the simulation command and the full console output of the process as a string. | required |
Source code in utilities/experiment_helpers/run_simulation_set.py
main(args)
Execute parameterized simulations using a multiprocessing pool.
This function initializes a multiprocessing pool to run simulations based on a set of parameters defined by the user. It parses the command-line arguments, expands parameter ranges for sampling, queues the simulations, and manages their execution in parallel. It also generates JSON files for each simulation that contain the parameters used for that run.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args | Namespace | Command-line arguments parsed by argparse. Required parameters include: | required |
Source code in utilities/experiment_helpers/run_simulation_set.py
robust_run_simulation_dask(*args, max_attempts=3, delay=5, **kwargs)
This function wraps around the run_simulation_dask function, adding retry logic to handle transient issues (e.g., connection problems). If an error occurs during the run_simulation_dask function, it will retry the operation a specified number of times with a delay between each attempt.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args | Any | Variable length argument list. | () |
| max_attempts | int | Maximum number of retry attempts. Default is 3. | 3 |
| delay | int | Delay between retry attempts in seconds. Default is 5. | 5 |
| kwargs | Any | Arbitrary keyword arguments. | {} |
Source code in utilities/experiment_helpers/run_simulation_set.py
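The retry logic described here follows a common pattern, sketched below for an arbitrary callable. The name `call_with_retries` is hypothetical. Catching every `Exception` and re-raising the last error once the attempts are exhausted are assumptions about the intended behaviour.

```python
import time
from typing import Any, Callable


def call_with_retries(
    func: Callable[..., Any],
    *args: Any,
    max_attempts: int = 3,
    delay: float = 5,
    **kwargs: Any,
) -> Any:
    """Call func(*args, **kwargs), retrying after failures.

    Sleeps `delay` seconds between attempts and re-raises the last
    exception if all `max_attempts` attempts fail.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay)
```

A fixed delay is the simplest choice. For failures caused by a congested shared file system, an increasing delay between attempts may be gentler, but that is a design variation and not something the documented function is stated to do.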
run_simulation(command)
Run simulation command.
This is the main routine for running a particular simulation. It runs the provided simulation command (a Python call to the simulation script with a set of CLI arguments) and captures all the output of the process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| command | str | Full command to execute the simulation. | required |
Returns:
| Type | Description |
|---|---|
| Tuple[Path, str] | The simulation command and the output of the process. |
Source code in utilities/experiment_helpers/run_simulation_set.py
run_simulation_dask(args, simulator_type, simulation_output_path, simulation_override_json, dyn_data_path)
Run the simulation command, copying the output folder to the node before execution and back to the original location afterward to prevent overload at PIC. Unlike the run_simulation function, this function does not capture all the terminal output of the process. This function is necessary for running train_tsnpe.py using Dask and HTCondor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args | Namespace | Arguments required for the simulation, including output directory, parameter overrides, and optional dynamic data path. | required |
| simulator_type | str | The type of simulator to use, determining the specific simulation script to run. | required |
| simulation_output_path | str | Path to the simulation output folder. | required |
| simulation_override_json | dict | Dictionary with the parameter values for the override.json file. | required |
| dyn_data_path | str | Dynamical database path. | required |
Source code in utilities/experiment_helpers/run_simulation_set.py
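The copy-in, run, copy-out sequence can be illustrated with a small staging helper. This is a sketch under assumptions: `run_staged` and its `work` callback are hypothetical, a temporary directory stands in for the node's local scratch space, and the real function launches a simulator script rather than calling a Python callable.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_staged(output_path: str, work: Callable[[Path], object]) -> None:
    """Copy output_path to local scratch, run work there, then copy back."""
    original = Path(output_path)
    with tempfile.TemporaryDirectory() as scratch:
        local_copy = Path(scratch) / original.name
        # Stage in: bring the output folder onto the local disk.
        shutil.copytree(original, local_copy)
        # All reads and writes made by the job now hit the local copy.
        work(local_copy)
        # Stage out: merge the results back into the original location.
        shutil.copytree(local_copy, original, dirs_exist_ok=True)
```

The shared storage then sees two bulk transfers per job instead of many small reads and writes during the run.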
safe_copytree(src, dst, max_attempts=3, delay=5)
Safely copy a directory tree with retries. This function attempts to copy a directory tree from the source path to the destination path. If the copy operation fails (e.g., due to connection issues), it will retry the operation a specified number of times with a delay between each attempt.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| src | str | Source directory path. | required |
| dst | str | Destination directory path. | required |
| max_attempts | int | Maximum number of retry attempts. Default is 3 retries. | 3 |
| delay | int | Delay between retry attempts in seconds. Default is 5 seconds. | 5 |
Source code in utilities/experiment_helpers/run_simulation_set.py
setup_process_pool(event, lock)
Set up the process pool for multiprocessing with a global pause/resume event.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| event | Event | Reference to a master process event that will signal the child processes to pause or resume execution. | required |
| lock | Lock | Reference to a master process lock that will coordinate the launching of child processes with waiting times. | required |
Source code in utilities/experiment_helpers/run_simulation_set.py
utilities.experiment_helpers.run_simulation_set_sbi
Simulator helper script.
This script allows us to run various simulator scripts in a multithreaded manner. Unlike the run_simulation_set.py script, which samples parameters randomly or on a grid, this script follows a prior distribution for parameter sampling. Note that this script can be run only when using a prior distribution from the sbi package that has the .sample() method available.
The number of values to be drawn for each parameter is specified by the argument --sampling_size.
Each parameter combination will spawn a new process when the simulator_multiprocess function is called. These processes enter a multithreaded pool for later execution, allowing the asynchronous simulation of many populations in parallel with a defined maximum number of threads. However, when the simulator_dask function is called, the multithreading is handled with Dask, enabling parallel execution of simulations across the Dask cluster in HTCondor.
NOTE: if an error occurs in one of the simulations, the script will not stop until all the processes have terminated. In this case, the error will only be shown in the terminal.
Display help message to run the code:
python run_simulation_set_sbi.py --help
Displays all the relevant arguments that can be used.
Authors:
Celsa Pardo Araujo (pardo@ice.csic.es)
Michele Ronchi (ronchi@ice.csic.es)
initialize_dask_cluster(logger, config)
Initialize a Dask cluster for distributed computing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| logger | Logger | Logger object. | required |
| config | ConfigurationParser | Configuration object specifying dataset loading parameters. | required |
Returns:
| Type | Description |
|---|---|
| HTCondorCluster | Initialized Dask cluster object. |
Source code in utilities/experiment_helpers/run_simulation_set_sbi.py
sample_without_nan(distribution, sampling_size, device, max_attempts=20)
Sample a distribution while removing NaN values from the sampled outputs. Stops after max_attempts if sufficient valid samples are not obtained.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| distribution | Any | The distribution to sample from. | required |
| sampling_size | int | The number of samples to draw from the distribution. | required |
| device | device | Device used to run the script. | required |
| max_attempts | int | The maximum number of attempts to sample (default is 20). | 20 |
Returns:
| Type | Description |
|---|---|
| Tensor | A tensor of samples where all NaN values have been removed. |
Source code in utilities/experiment_helpers/run_simulation_set_sbi.py
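The resampling loop behind this function can be sketched without any tensor library. Everything here is illustrative: plain lists of floats stand in for tensors, the `draw` callable stands in for the distribution's `.sample()` method, and returning fewer samples than requested when the attempts run out is an assumption, since the documentation only says that sampling stops.

```python
import math
from typing import Callable, List, Sequence

Row = Sequence[float]


def sample_without_nan(
    draw: Callable[[int], List[Row]],
    sampling_size: int,
    max_attempts: int = 20,
) -> List[Row]:
    """Collect up to sampling_size rows that contain no NaN values.

    `draw(n)` is a hypothetical stand-in that returns n candidate rows.
    Each attempt requests only the rows that are still missing.
    """
    valid: List[Row] = []
    for _ in range(max_attempts):
        missing = sampling_size - len(valid)
        if missing <= 0:
            break
        batch = draw(missing)
        valid.extend(
            row for row in batch if not any(math.isnan(value) for value in row)
        )
    return valid[:sampling_size]
```

Requesting only the missing rows on each attempt keeps the total number of draws close to the target when NaN samples are rare, while the attempt limit bounds the work when they are common.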
simulator_dask(args_dict, prior, dataset, device)
Execute simulations based on the provided prior distribution in a multithreaded manner using the Dask package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args_dict | Dictionary | Dictionary with the arguments. | required |
| prior | DirectPosterior | Prior distribution. | required |
| dataset | DatasetMultichannelArray | Stores statistics and scaling information used in the prior distribution. | required |
| device | device | Device used to run the script. | required |
Source code in utilities/experiment_helpers/run_simulation_set_sbi.py
simulator_multiprocess(args_dict, prior, dataset, device)
Execute simulations based on the provided prior distribution in a multithreaded manner using the multiprocessing package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args_dict | Dictionary | Dictionary with the arguments. | required |
| prior | DirectPosterior | Prior distribution. | required |
| dataset | DatasetMultichannelArray | Stores statistics and scaling information used in the prior distribution. | required |
| device | device | Device used to run the script. | required |
Source code in utilities/experiment_helpers/run_simulation_set_sbi.py