Commit a3cf590f authored by Kyle Klenk (kck540)
Update README.md

parent a9b25f4d
1 merge request: !2 Update README.md
......@@ -4,160 +4,87 @@ SUMMA-Actors is a modified version of the already existing SUMMA software that c
found [here](https://github.com/CH-Earth/summa#readme). SUMMA-Actors uses the Actor Model to increase scalability and fault-tolerance. It is built using the [C++ Actor Framework](https://github.com/actor-framework/actor-framework).
## Documentation
A more in-depth documentation can be found [here](https://summa-actors.readthedocs.io/en/latest/)
More in-depth documentation can be found [here](https://git.cs.usask.ca/numerical_simulations_lab/actors/Summa-Actors/-/wikis/home). SUMMA-Actors depends on many files from the [original SUMMA repo](https://github.com/CH-Earth/summa). Below is a quick start guide for compiling and running SUMMA-Actors; please consult our wiki for more in-depth documentation. Consider creating an issue to report missing documentation, ask questions, or provide suggestions.
## Directory Structure
SummaActors is set up with the following sub-directories; we will refer to the top-level Summa-Actors directory as the `root_dir`:
- bin
- build
- cmake
- includes
- makefiles
- source
- summa (https://github.com/CH-Earth/summa)
- utils
- README.md
First clone Summa-Actors to your workstation. Then cd into `build/` and clone `summa` or `summa-sundials` into Summa-Actor's build directory.
## Compiling Summa-Actors
SUMMA-Actors is written in C++ and FORTRAN and is currently tested for the following compilers:
The best way to compile SUMMA-Actors is with `cmake`. The `CMakeLists.txt` is found in `build/cmake/`. When compiling Summa-Actors there are two options: compiling with the Sundials library and compiling without it. Which option is used is defined by the following line inside the cmake file:
option(SUNDIALS "Use SUNDIALS" ON)
ON compiles with Sundials, OFF compiles without sundials.
Summa-Actors has the following dependencies:
* g++
* gfortran
SUMMA-Actors depends on the following Libraries:
* [NetCDF-Fortran](https://github.com/Unidata/netcdf-fortran)
* [OpenBLAS](https://github.com/xianyi/OpenBLAS)
* [C++ Actor Framework](https://github.com/actor-framework/actor-framework)
Once the following libraries have been installed SUMMA-Actors can be compiled in
one of two ways. The first way is to modify the Makefile directly and the second
is to invoke the makefile by shellscript:
### Method 1: Makefile
This method is best used for a workstation build that does not have access to a Compute Canada Software Stack, where includes and libraries need to be specified explicitly.
The Makefile is located in the `build` directory and the following variables will need to be uncommented and changed:
- F_MASTER = Path/to/Summa-Actors/ # this is the directory above build/
- FC = gfortran
- CC = g++
- INCLUDES = Path/to/netcdf/includes
- LIBRARIES = Path/to/netcdf/lib
Path/to/openblas
-lnetcdff -lopenblas
- ACTORS_INCLUDES = $INCLUDES
Path/to/CAF/includes
- ACTORS_LIBRARIES = $LIBRARIES
PATH/to/CAF/lib
$(F_MASTER)/bin
-lcaf_core -lcaf_io -lsumma -lopenblas -lnetcdff
After setting these variables, SUMMA-Actors can be compiled with `make`.
Once compiled you will need to set the library path variable with the following command (replace F_Master with the full path to the SUMMA-Actors directory):
`export LD_LIBRARY_PATH=/F_MASTER/bin`
See section Running SUMMA-Actors for instructions on how to use the program.
* [C++ Actor Framework](https://github.com/actor-framework/actor-framework)
Once the libraries above have been installed and the Sundials option is set as desired, complete the following steps:
- cd into `build/cmake/`
- create another build directory
- cd into `build/cmake/build`
- run `cmake ..`
- run `make`
- a `summa_actors` executable should have been created inside the `bin/` directory
## Running SUMMA-Actors
Running SUMMA-Actors is done with the following command:
./summa_actor -g startGRU -n numGRU -c path_to_config_file
The values for -g and -n are integers, and -c is the full path to the configuration file for SUMMA-Actors.
The configuration file is a json file. The contents of the JSON file are below:
{
"Distributed_Settings": {
"distributed_mode": false,
"servers_list": [{"hostname": "simlab01"}, {"hostname": "simlab05"}],
"port": 4444,
"total_hru_count": 517315,
"num_hru_per_batch": 1000
},
"Summa_Actor": {
"max_gru_per_job": 500
},
"File_Access_Actor": {
"num_partitions_in_output_buffer": 4,
"num_timesteps_in_output_buffer": 500
},
### Method 2: Shell Script
This method is best used for cluster environments that have access to the Compute Canada Software Stack. This method has been tested on the University of Saskatchewan's Copernicus and Compute Canada's Graham. This method will invoke the Makefile and no modifications should be made to the Makefile for this method.
The compilation script is located in the `build` directory as `compile_summa.sh`. The only variable within the compile script that needs to be changed is the `F_MASTER` path. This is the absolute path to the SUMMA-Actors directory.
"Job_Actor": {
"file_manager_path": "/gladwell/kck540/Sundials_Settings/fileManager_actors.txt",
"output_csv": false,
"csv_path": ""
},
Once the F_MASTER path has been specified the following command can be used to build SUMMA-Actors from the build directory:
`source compile_summa.sh`
"HRU_Actor": {
"print_output": true,
"output_frequency": 1000
}
}
This script should load all of the required modules from the Compute Canada Software Stack as well as set the environment variable `LD_LIBRARY_PATH` required for running SUMMA-Actors.
The settings above should work for most use cases. Some of them, such as `max_gru_per_job`, are features we want to automate. The only field that you should have to adjust is `file_manager_path`, the path to the file that manages the complete configuration of the SUMMA simulation. The SUMMA configuration files are explained in more depth in the following [documentation](https://summa.readthedocs.io/en/latest/input_output/SUMMA_input/).
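As a rough illustration of the point above, the following Python sketch checks a configuration file before a run. It is not part of SUMMA-Actors: the function name `check_config` and the list `REQUIRED_SECTIONS` are hypothetical, and the section and key names are simply copied from the example JSON.

```python
import json
from pathlib import Path

# Section names copied from the example configuration; whether SUMMA-Actors
# itself requires all of them is an assumption of this sketch.
REQUIRED_SECTIONS = (
    "Distributed_Settings",
    "Summa_Actor",
    "File_Access_Actor",
    "Job_Actor",
    "HRU_Actor",
)


def check_config(text):
    """Return a list of problems found in the JSON configuration text."""
    settings = json.loads(text)
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name not in settings
    ]
    # file_manager_path is the one field the README says must be adjusted.
    path = settings.get("Job_Actor", {}).get("file_manager_path", "")
    if not path:
        problems.append("Job_Actor.file_manager_path is empty")
    elif not Path(path).is_file():
        problems.append(f"file manager not found: {path}")
    return problems
```

Calling `check_config(Path("config.json").read_text())` (the file name here is a placeholder) returns an empty list when every section is present and the file manager exists.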
## Running Summa-Actors
SUMMA-Actors requires some setup to be configured properly. There is a python script `configuration.py` in the `config` directory to help with the process. See the `Config File` section for more information on how to use the configuration script.
SUMMA-Actors has the following options:
```
-g, Specifies the starting index of the first GRU to compute
-n, Specifies the number of GRUs to compute
-c, Specifies the path to the config directory
OPTIONAL:
--config-file, specifies specific configurations for the C++ Actor Framework. For more information about specific configurations see the following [link](https://actor-framework.readthedocs.io/en/stable/ConfiguringActorApplications.html#configuration-files). The most important config option is specifying the maximum number of threads. This can be done with the automatic configuration script, more on this in the next section.
```
The following is an example invocation of SUMMA-Actors:
`./summaMain -g 1 -n 10 -c /path/to/config/directory --config-file=/path/to/actors/config/file`
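When launching runs from a script, the same invocation can be assembled programmatically. The sketch below is only an illustration: `build_command` is a hypothetical helper, and the executable name and paths are the placeholders from the example above.

```python
import shlex


def build_command(executable, start_gru, num_gru, config_dir, caf_config=None):
    """Assemble the argument list for one SUMMA-Actors run."""
    cmd = [executable, "-g", str(start_gru), "-n", str(num_gru), "-c", config_dir]
    # --config-file is optional and only configures the C++ Actor Framework.
    if caf_config is not None:
        cmd.append(f"--config-file={caf_config}")
    return cmd


cmd = build_command(
    "./summaMain", 1, 10,
    "/path/to/config/directory",
    "/path/to/actors/config/file",
)
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run(cmd)`; keeping the arguments as a list avoids shell quoting issues with paths.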
Once all the required settings of the `Summa_Actors_Settings.json` file have been filled out, run the python script `configuration.py` again (the same file as before; if `Summa_Actors_Settings.json` exists, the script invokes the setup for running SUMMA-Actors). This creates the required slurm submission script as well as the config file specifying the number of hardware threads for CAF++. It also creates all of the subdirectories necessary for program output, and you should see a file `run_summa.sh` in the build directory that should require no further modification.
#### Config File ####
The configuration file is meant to help automate the job submission process. This does require some initial setup to get working properly. On the first install of SUMMA-Actors the `config` directory should only contain the file `configuration.py`. Running the python file with `python3 configuration.py` will create a file `Summa_Actors_Settings.json`. This file manages the settings and paths for SUMMA-Actors. Below is a detailed explanation of the settings as well as how to use the configuration script to set up a slurm submission script.
SUMMA-Actors settings are categorized into six objects, `configuration`, `JobSubmissionParams`, `SummaActors`, `FileAccessActor`, `JobActor`, and `HRUActor`, each containing multiple key:value pairs.
* Configuration
The configuration object contains all the key:value pairs necessary for creating the fileManager.txt file that is used to specify the settings for SUMMA-Actors. This is very similar to fileManager.txt file used in the original version of SUMMA. Specifying the values in the JSON file will allow the configuration script to create the fileManager.txt file as well as specify the path to the job actor. Below is an example of how the settings should look.
```
"controlVersion": "SUMMA_FILE_MANAGER_V3.0.0",
"simStartTime": "1979-01-01 00:00",
"simEndTime": "2019-12-31 23:00",
"tmZoneInfo": "utcTime",
"settingsPath": "/Path/To/Settings/",
"forcingPath": "/Path/To/Forcing/Data/",
"outputPath": "/Path/To/Output/", #NOTE: Specify a top level directory, subdirectories of slurm, netcdf, csv will be created automatically by the configuration script
"forcingFreq": "month",
"forcingStart": "1979-01-01",
"decisionsFile": "modelDecisions.txt",
"outputControlFile": "outputControl.txt",
"globalHruParamFile": "localParamInfo.txt",
"globalGruParamFile": "basinParamInfo.txt",
"attributeFile": "attributes.nc",
"trialParamFile": "trialParams.nc",
"forcingListFile": "forcingFileList.txt",
"initConditionFile": "coldState.nc",
"outFilePrefix": "SummaActors",
"vegTableFile": "TBL_VEGPARM.TBL",
"soilTableFile": "TBL_SOILPARM.TBL",
"generalTableFile": "TBL_GENPARM.TBL",
"noahmpTableFile": "TBL_MPTABLE.TBL"
```
* JobSubmission
The JobSubmission object contains all the key:value pairs necessary for creating the slurm submission script. The script will be created automatically from the inputted settings with the configuration settings. Below is an example of how the settings should look.
```
"cpus-per-task": 8,
"memory": "32G",
"job-name": "SummaActors",
"account": "Account Name",
"numHRUs": 517315,
"maxNumberOfJobs": 517,
"maxGRUsPerSubmission": 1000,
"executablePath": "/Path/To/SUMMA"
```
SUMMA-Actors is still a work in progress, so the best settings for running jobs may depend on the specifics of your workflow. SUMMA-Actors has been tested with the North American Domain Dataset. For this dataset, running jobs of 1000 HRUs with 8 CPUs and 32 GB of RAM has worked best. You are welcome to experiment and find what works best for you.
NOTE: For optimal scheduler performance it is recommended to use 1 core per 4GB of RAM. So if you wanted to use 4 CPUs you should only use 16 GB of RAM. This is not a requirement and more of a guideline. Running SUMMA-Actors with less RAM could result in issues if the number of HRUs per Submission is not adjusted accordingly as well as some other settings in the `SummaActor` settings. See the `general guidelines` section for more information.
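The guideline in the note is simple arithmetic. A minimal sketch, assuming a hypothetical helper `suggested_memory` that produces a value in the same form as the `memory` setting above:

```python
GB_PER_CPU = 4  # guideline from the note above, not a hard requirement


def suggested_memory(cpus):
    """Return a memory string such as "32G" for the given CPU count."""
    return f"{cpus * GB_PER_CPU}G"


print(suggested_memory(8))  # 32G
print(suggested_memory(4))  # 16G
```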
* SummaActor
The SummaActor object contains the key:value pairs that control how large the output structure is as well as how many GRUs run at one time within a job. These values are needed to control how much RAM the program uses. When you submit a job of 1000 GRUs and the `maxGRUPerJob` value is set to 500, SUMMA-Actors will only run 500 GRUs at one time. It will complete all 1000 GRUs in two separate batches, computing one batch of 500 and then the other. Each batch will create its own output file. Below is an example of how these settings should look.
```
"OuputStructureSize": 250, # How many timesteps of data the output structure can hold at once before needing to write the data to a file.
"maxGRUPerJob": 500
```
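The batching described for `maxGRUPerJob` can be sketched as follows. This only illustrates the arithmetic and is not the SUMMA-Actors implementation; `split_into_batches` is a hypothetical name, and the 1-based start index is an assumption based on the `-g 1` example invocation.

```python
def split_into_batches(start_gru, num_gru, max_gru_per_job):
    """Return (start, count) pairs covering num_gru GRUs from start_gru."""
    if max_gru_per_job < 1:
        raise ValueError("max_gru_per_job must be at least 1")
    batches = []
    start = start_gru
    remaining = num_gru
    while remaining > 0:
        # The last batch may be smaller than max_gru_per_job.
        count = min(max_gru_per_job, remaining)
        batches.append((start, count))
        start += count
        remaining -= count
    return batches


print(split_into_batches(1, 1000, 500))  # [(1, 500), (501, 500)]
```

With a job of 1000 GRUs and a limit of 500, the sketch yields two batches, matching the two output files described above.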
* FileAccessActor
To speed up writes, SUMMA-Actors needs to write output in large chunks. The setting in the FileAccessActor controls how many chunks the output structure is divided into. A good rule of thumb is to use as many vectors as there are CPUs. Below is an example of what these settings should look like.
```
"num_vectors_in_output_manager": 8
```
* JobActor
These settings control the path to the fileManager.txt file. This is set up automatically by the configuration script. You may also print output as csv. These are specific HRU timings of how long each HRU spent in certain parts of the program. The path will be set up automatically upon running the configuration script. The only setting that needs to be changed is the boolean value to control if output is desired. Below is an example of how these settings should look.
```
"FileManagerPath": "", # SET AUTOMATICALLY
"outputCSV": true,
"csvPath": "" # SET AUTOMATICALLY
```
* HRUActor
These settings control how often if at all the HRU will display output. The frequency is per timestep. If this value is set very low it will slow down the execution of the program. A good value is anything over 10000 for production runs. Below is an example for how these settings should look.
```
"printOutput": true,
"outputFrequency": 50000
```
#### General Guidelines ####
For optimal performance and the best chance of jobs succeeding, it is recommended to use the following settings:
```
"cpus-per-task": 8,
"memory": "32G",
"maxGRUsPerSubmission": 1000,
"maxGRUPerJob": 500
"num_vectors_in_output_manager": 8
"outputFrequency": 50000
```
Feel free to experiment with different values, although you may encounter out-of-memory errors with your jobs. The amount of RAM used is very dependent on which HRUs are being computed. If you want to use
## Credits
The initial implementation of SUMMA is credited to the initial publications below. These
publications can be found in [Water Resources Research](http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1944-7973).
......