@@ -56,43 +56,108 @@ Once the F_MASTER path has been specified the following command can be used to b
...
This script should load all of the required modules from the Compute Canada Software Stack as well as set the environment variable `LD_LIBRARY_PATH` required for running SUMMA-Actors.
## Running SUMMA-Actors
SUMMA-Actors requires some setup before it can run. The Python script `configuration.py` in the `config` directory helps with this process. See the `Config File` section for more information on how to use the configuration script.
Once the binary is compiled it can be run like the following example command:
SUMMA-Actors has the following options:
```
-g, Specifies the starting index of the first GRU to compute
-n, Specifies the number of GRUs to compute
-c, Specifies the path to the config directory
OPTIONAL:
--config-file, specifies specific configurations for the C++ Actor Framework. For more information about specific configurations see the following [link](https://actor-framework.readthedocs.io/en/stable/ConfiguringActorApplications.html#configuration-files). The most important config option is specifying the maximum number of threads. This can be done with the automatic configuration script, more on this in the next section.
```
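As a minimal sketch, the options listed above could be assembled into a command line as follows. The helper name `build_summa_command` is hypothetical, the executable name `./summaMain` is taken from the example used later in this README, and the `--config-file=<path>` spelling is an assumption about how the C++ Actor Framework parses long options.

```python
from typing import List, Optional


def build_summa_command(start_gru: int, num_gru: int, config_dir: str,
                        caf_config: Optional[str] = None,
                        executable: str = "./summaMain") -> List[str]:
    """Build an argument list from the documented -g, -n and -c options.

    This helper is hypothetical and only illustrates the option layout.
    """
    if num_gru < 1:
        raise ValueError("num_gru must be at least 1")
    cmd = [executable, "-g", str(start_gru), "-n", str(num_gru), "-c", config_dir]
    if caf_config is not None:
        # Assumption: the long option takes its value in --name=value form.
        cmd.append(f"--config-file={caf_config}")
    return cmd


if __name__ == "__main__":
    # Prints the command as a single shell-style line.
    print(" ".join(build_summa_command(1, 1000, "/path/to/config/directory")))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting problems with paths that contain spaces.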
The following is an example invocation of SUMMA-Actors:
Once all the required settings in the `Summa_Actors_Settings.json` file have been filled out, run the Python script `configuration.py` again. It is the same script as before: if `Summa_Actors_Settings.json` already exists, the script runs the setup for SUMMA-Actors instead of creating the file. This creates the Slurm submission script and the config file that specifies the number of hardware threads for CAF. It also creates all of the subdirectories needed for program output, and you should then see a file `run_summa.sh` in the build directory that requires no further modification.
-g = starting index of the first GRU to compute
-c = number of GRUs to run
-m = path to the file manager
OPTIONAL: --config-file = /path/to/config/file
This config file specifies to the C++ Actor Framework how many threads to use when executing the program. If left out the C++ Actor Framework will automatically set this value based on your system.
#### Config File ####
SUMMA-Actors settings can be modified from a JSON file provided in config/Summa_Actors_Settings.json.
The configuration file is meant to help automate the job submission process. This requires some initial setup. On a fresh install of SUMMA-Actors the `config` directory should contain only the file `configuration.py`. Running it with `python3 configuration.py` creates the file `Summa_Actors_Settings.json`, which manages the settings and paths for SUMMA-Actors. Below is a detailed explanation of the settings and of how to use the configuration script to set up a Slurm submission script.
There are three types of actors that can be configured:
* SummaActor
- OutputStructureSize = The number of timesteps of data an HRU can hold before it needs to contact
the file_access_actor to write the data to a file.
- maxGRUPerJob = The number of GRUs that SUMMA-Actors will attempt to run at once. For example, if this value
is set to 500 and you invoke the program with `./summaMain -g 1 -n 1000 -c /path/to/config/directory`,
SUMMA-Actors will only spawn 500 actors at a time and compute all 1000 in two batches.
Both of the above settings control the amount of RAM SUMMA-Actors uses. Larger numbers can cause your
job to run out of memory. For reference, we have found that setting both to 500 uses around 20 GB of RAM.
* JobActor
- FileManagerPath = Path to the fileManager.txt file needed by SUMMA. This file is relatively
unchanged from the original version of SUMMA, with two additions. An example file is provided in the
config/ directory called fileManager_example.txt
- outputCSV = Boolean value controlling whether individual HRU run-time statistics are written when each HRU completes.
- csvPath = The path that the CSV file will be written to.
* HRUActor
- printOutput = Boolean value controlling whether each HRU prints information on where it is in
its computation, i.e. what timestep it is on and some other timing information.
- outputFrequency = How often an HRU prints to stdout, given as an interval in timesteps. Note: Lower numbers can decrease performance, as stdout begins to lag the more that needs to be printed.
SUMMA-Actors settings are organized into six objects, `configuration`, `JobSubmissionParams`, `SummaActors`, `FileAccessActor`, `JobActor`, and `HRUActor`, each containing multiple key:value pairs.
* Configuration
The configuration object contains all the key:value pairs necessary for creating the fileManager.txt file that specifies the settings for SUMMA-Actors. This file is very similar to the fileManager.txt file used in the original version of SUMMA. Specifying the values in the JSON file allows the configuration script to create the fileManager.txt file and to set its path for the job actor. Below is an example of how the settings should look.
```
"controlVersion": "SUMMA_FILE_MANAGER_V3.0.0",
"simStartTime": "1979-01-01 00:00",
"simEndTime": "2019-12-31 23:00",
"tmZoneInfo": "utcTime",
"settingsPath": "/Path/To/Settings/",
"forcingPath": "/Path/To/Forcing/Data/",
"outputPath": "/Path/To/Output/", #NOTE: Specify a top level directory, subdirectories of slurm, netcdf, csv will be created automatically by the configuration script
"forcingFreq": "month",
"forcingStart": "1979-01-01",
"decisionsFile": "modelDecisions.txt",
"outputControlFile": "outputControl.txt",
"globalHruParamFile": "localParamInfo.txt",
"globalGruParamFile": "basinParamInfo.txt",
"attributeFile": "attributes.nc",
"trialParamFile": "trialParams.nc",
"forcingListFile": "forcingFileList.txt",
"initConditionFile": "coldState.nc",
"outFilePrefix": "SummaActors",
"vegTableFile": "TBL_VEGPARM.TBL",
"soilTableFile": "TBL_SOILPARM.TBL",
"generalTableFile": "TBL_GENPARM.TBL",
"noahmpTableFile": "TBL_MPTABLE.TBL"
```
* JobSubmission
The JobSubmission object contains all the key:value pairs necessary for creating the Slurm submission script. The configuration script creates the submission script automatically from these settings. Below is an example of how the settings should look.
```
"cpus-per-task": 8,
"memory": "32G",
"job-name": "SummaActors",
"account": "Account Name",
"numHRUs": 517315,
"maxNumberOfJobs": 517,
"maxGRUsPerSubmission": 1000,
"executablePath": "/Path/To/SUMMA"
```
SUMMA-Actors is still a work in progress, so the best settings for running jobs may depend on the specifics of your workflow. SUMMA-Actors has been tested with the North American Domain Dataset. For this dataset, running jobs of 1000 HRUs with 8 CPUs and 32 GB of RAM has worked best. You are welcome to experiment and find what works best for you.
NOTE: For optimal scheduler performance it is recommended to use 1 core per 4 GB of RAM. So if you wanted to use 4 CPUs you should only use 16 GB of RAM. This is a guideline rather than a requirement. Running SUMMA-Actors with less RAM could cause issues unless the number of HRUs per submission and some of the `SummaActor` settings are adjusted accordingly. See the `General Guidelines` section for more information.
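The 1 core per 4 GB guideline can be expressed as a small helper. This is only a sketch: the function name `recommended_memory` is hypothetical, and the returned string simply follows the `"32G"` style used for the `memory` setting above.

```python
GB_PER_CORE = 4  # guideline from the note above: 1 core per 4 GB of RAM


def recommended_memory(cpus: int) -> str:
    """Return a memory string such as "32G" for a given CPU count (hypothetical helper)."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return f"{cpus * GB_PER_CORE}G"


if __name__ == "__main__":
    for cpus in (4, 8):
        print(f"cpus-per-task={cpus} -> memory={recommended_memory(cpus)}")
```

With 4 CPUs this gives `16G` and with 8 CPUs `32G`, matching the values quoted in this section.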
* SummaActor
The SummaActor object contains the key:value pairs that control how large the output structure is and how many GRUs run at one time within a job. These values control how much RAM the program uses. When you submit a job of 1000 GRUs and the `maxGRUPerJob` value is set to 500, SUMMA-Actors will only run 500 GRUs at one time. It completes all 1000 GRUs in two separate batches, computing one batch of 500 and then another. Each batch creates its own output file. Below is an example of how these settings should look.
```
"OuputStructureSize": 250, # How many timesteps of data the output structure can hold at once before needing to write the data to a file.
"maxGRUPerJob": 500
```
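The batching arithmetic described above can be sketched as follows. The function `gru_batches` is hypothetical and assumes that batches cover contiguous GRU indices in order; how SUMMA-Actors assigns GRUs to batches internally is not specified here.

```python
from typing import List, Tuple


def gru_batches(start_gru: int, num_gru: int, max_gru_per_job: int) -> List[Tuple[int, int]]:
    """Split a job into (first GRU index, batch size) pairs.

    Hypothetical helper: assumes contiguous, in-order batches.
    """
    if num_gru < 0 or max_gru_per_job < 1:
        raise ValueError("num_gru must be >= 0 and max_gru_per_job >= 1")
    batches = []
    next_start = start_gru
    remaining = num_gru
    while remaining > 0:
        size = min(max_gru_per_job, remaining)
        batches.append((next_start, size))
        next_start += size
        remaining -= size
    return batches


if __name__ == "__main__":
    # A job of 1000 GRUs with maxGRUPerJob set to 500 yields two batches.
    print(gru_batches(1, 1000, 500))
```

A job whose size is not a multiple of `maxGRUPerJob` ends with a smaller final batch under this assumption.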
* FileAccessActor
To speed up writing in SUMMA-Actors, output must be written in large chunks. The setting in the FileAccessActor controls how many chunks the output structure is divided into. A good rule of thumb is to use as many vectors as there are CPUs. Below is an example of what these settings should look like.
```
"num_vectors_in_output_manager": 8
```
* JobActor
These settings control the path to the fileManager.txt file, which is set up automatically by the configuration script. You may also write output as CSV. This output contains per-HRU timings of how long each HRU spent in certain parts of the program. The CSV path is also set up automatically when the configuration script runs. The only setting that needs to be changed is the boolean value that controls whether this output is desired. Below is an example of how these settings should look.
```
"FileManagerPath": "", # SET AUTOMATICALLY
"outputCSV": true,
"csvPath": "" # SET AUTOMATICALLY
```
* HRUActor
These settings control how often, if at all, each HRU displays output. The frequency is given in timesteps. If this value is set very low it will slow down the execution of the program. A good value is anything over 10000 for production runs. Below is an example of how these settings should look.
```
"printOutput": true,
"outputFrequency": 50000
```
#### General Guidelines ####
For optimal performance and the best chance of jobs succeeding, it is recommended to use the following settings:
```
"cpus-per-task": 8,
"memory": "32G",
"maxGRUsPerSubmission": 1000,
"maxGRUPerJob": 500,
"num_vectors_in_output_manager": 8,
"outputFrequency": 50000
```
Feel free to experiment with different values, although you may encounter out-of-memory errors with your jobs. The amount of RAM used depends heavily on which HRUs are being computed. If you want to use
## Credits
The initial implementation of SUMMA is credited to the initial publications below. These
publications can be found in [Water Resources Research](http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1944-7973).