Apptainer image configuration for the Costs and Benefits module
Apptainer
Apptainer is a container technology that simplifies the creation and execution of containers. It is a widely used alternative to Docker in scientific computing.
We are going to use Apptainer to create an image for the Costs and Benefits module.
The Apptainer installation steps can be found at the following URL:
Download and interact with pre-built images
We can download pre-built images from repositories like:
- https://hub.docker.com
- https://quay.io
Docker Hub download example:
apptainer pull docker://alpine
Quay download example:
apptainer pull docker://quay.io/jitesoft/alpine
Image configuration from sandbox
The steps to configure the image using the sandbox format are as follows:
- Build the image in sandbox format:
apptainer build --sandbox <DIR> <URI | image>
- Request a shell in the sandbox directory:
apptainer shell --writable <DIR>
- Make the required configurations.
- Convert the sandbox to SIF format:
apptainer build image.sif <DIR>
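The steps above can also be scripted. The following is a minimal Python sketch that only assembles the Apptainer command lines and prints them unless asked to run them. The helper names (`sandbox_workflow`, `run_all`) and the use of `apptainer exec` as a non-interactive stand-in for the interactive shell step are assumptions made for illustration, not part of the original workflow.

```python
import shlex
import subprocess


def sandbox_workflow(source, sandbox_dir, sif_path, setup_commands):
    """Return the argv lists for: build sandbox, configure it, convert to SIF."""
    commands = [["apptainer", "build", "--sandbox", sandbox_dir, source]]
    for setup in setup_commands:
        # Non-interactive stand-in for typing the command in `apptainer shell --writable`.
        commands.append(
            ["apptainer", "exec", "--writable", "--fakeroot", sandbox_dir, "sh", "-c", setup]
        )
    commands.append(["apptainer", "build", sif_path, sandbox_dir])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


steps = sandbox_workflow("docker://ubuntu:22.04", "cb_module", "image.sif", ["apt update"])
run_all(steps)  # dry run: prints the commands without invoking Apptainer
```

Keeping the default as a dry run makes it possible to review the exact commands before anything is built.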
Costs and Benefits module
The Costs and Benefits repository contains data for 26 LAC countries.
The 26 LAC countries (ISO alpha-3 codes) are: ARG, BHS, BRB, BLZ, BOL, BRA, CHL, COL, CRI, DOM, ECU, SLV, GTM, GUY, HTI, HND, JAM, MEX, NIC, PAN, PRY, PER, SUR, TTO, URY, VEN
To scale the analysis to more countries, it is necessary to add information to the following files:
- AGRC_LVST_productivity_cost_gdp.csv
- ENTC_REDUCE_LOSSES_cost_file.xlsx (sheet name Annual Loss Reduction Cost)
- LNDU_soil_carbon_fractions.csv
We will use K-means to group countries, so that the existing values can be imputed to the rest of the countries.
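The imputation program itself is not reproduced in this document, so the following is only an illustrative, dependency-free sketch of cluster-based imputation: countries are grouped with K-means on some features, and a country without data receives the mean of the known values in its group. The function names, the two features, and all numbers are made up for the example; inside the container, scikit-learn's `KMeans` would be the natural replacement for the hand-written loop.

```python
from statistics import mean


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


def impute_by_cluster(features, known, k):
    """features: {iso3: (f1, f2, ...)}; known: {iso3: value} for countries with data."""
    countries = sorted(features)
    labels = dict(zip(countries, kmeans([features[c] for c in countries], k)))
    imputed = {}
    for country in countries:
        if country in known:
            continue
        donors = [known[d] for d in known if labels[d] == labels[country]]
        # Fall back to the overall mean when a cluster has no donor country.
        imputed[country] = mean(donors) if donors else mean(list(known.values()))
    return imputed


# Hypothetical features and cost values, for illustration only.
features = {
    "ARG": (0.8, 0.2), "BRA": (0.7, 0.3), "HND": (0.2, 0.8),
    "HTI": (0.1, 0.9), "CAN": (0.9, 0.1), "ARM": (0.3, 0.7),
}
known = {"ARG": 10, "BRA": 14, "HND": 4, "HTI": 2}
imputed = impute_by_cluster(features, known, k=2)
print(imputed)
```

In this toy data ARM falls in the same group as HND and HTI, and CAN in the same group as ARG and BRA, so each receives the mean of its group's known values.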
Building the image for the Costs and Benefits module execution
1) Build the image in sandbox format
We will configure the image to run SISEPUEDE from a pre-built Ubuntu 22.04.4 LTS image from Docker Hub:
apptainer pull docker://ubuntu:22.04
We will use the sandbox format to make changes to the container image. Later we will build the image from steps defined in a definition file.
The downloaded image is immutable. Since we must make configuration changes, we will create a sandbox from the image:
apptainer build --sandbox cb_module ubuntu_22.04.sif
2) Request a shell in the sandbox directory
We request a shell in the container generated by Apptainer from the sandbox format:
apptainer shell -pfw --no-mount home cb_module
The flags used mean the following:
- -w, --writable : by default all Apptainer containers are available as read only. This option makes the file system accessible as read/write.
- -f, --fakeroot : run container with the appearance of running as root
- -p, --pid : run container in a new PID namespace
Inside the sandbox we have to update packages:
Apptainer> apt update
We write Apptainer> at the beginning of the prompt to make explicit that we are working in a new shell inside the container, which we can interact with as though it were a virtual machine.
Python 3.11 installation in the sandbox:
Apptainer> apt install -y python3.11 python3.11-venv python3-venv python3-dev python3-pip
Create the file requirements.txt for installing the Python packages:
Apptainer> echo "
pandas==2.2.1
scikit-learn
openpyxl
munch==2.5.0
PyYAML==6.0
geopy==2.1.0
SQLAlchemy==2.0.29
julia==0.6.2
" > requirements.txt
Apptainer> pip install -r requirements.txt
Install git to download the Costs and Benefits repository:
Apptainer> apt install -y git
Apptainer> cd /opt
Apptainer> git clone https://github.com/nidiot/sisepuede_costs_benefits.git
Clone the SISEPUEDE repository:
Apptainer> git clone https://github.com/jcsyme/sisepuede.git
Create the directory from which the file of ISO alpha-3 country codes will be read, and download the file into it:
Apptainer> mkdir -p /opt/sisepuede_data/Energy/nemomod_entc_residual_capacity_pp_gas_gw/raw_data
Apptainer> cd /opt/sisepuede_data/Energy/nemomod_entc_residual_capacity_pp_gas_gw/raw_data
Apptainer> apt install -y wget
Apptainer> wget https://raw.githubusercontent.com/milocortes/sisepuede_data/main/Energy/nemomod_entc_residual_capacity_pp_gas_gw/raw_data/iso3_all_countries.csv
Apptainer> cd /opt
Install R and the required packages:
Apptainer> apt install -y r-base r-base-dev
Apptainer> LC_ALL=C.UTF-8 R -e 'install.packages(c("purrr", "stringr", "tidyr", "data.table", "readxl", "dplyr", "reshape2", "lhs", "reshape"))'
Download and run the Python program that imputes data for the rest of the countries with K-means:
Apptainer> wget https://raw.githubusercontent.com/milocortes/sisepuede_data/main/utils/actualiza_datos_cb.py
Apptainer> python3 actualiza_datos_cb.py
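We override some lines of the programs in the Costs and Benefits repository, particularly those that set local paths. The sed commands replace lines by number, so they assume the line layout of the repository at the time of writing: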
Apptainer> mkdir -p /opt/ssp
Apptainer> mkdir -p /opt/cb
Apptainer> FILE_CB="/opt/sisepuede_costs_benefits/Main/cb_calculate_costs_and_benefits_script.R"
Apptainer> sed -i '6s/.*/setwd("\/opt\/sisepuede_costs_benefits\/Main\/")/' $FILE_CB
Apptainer> sed -i '13s/.*/path_to_model_results<-"\/opt\/cb\/"/' $FILE_CB
Apptainer> sed -i '14s/.*/path_to_ssp_results<-"\/opt\/ssp\/"/' $FILE_CB
Apptainer> sed -i '15s/.*/data_filename<-paste0(path_to_ssp_results,/' $FILE_CB
Apptainer> sed -i '16s/.*/ list.files(path=path_to_ssp_results, /' $FILE_CB
Apptainer> sed -i '17s/.*/ pattern = glob2rx("sisepuede_results_sisepuede_run_*"))) #path to model output runs/' $FILE_CB
Apptainer> sed -i '22s/.*/primary_filename<-paste0(path_to_ssp_results, "ATTRIBUTE_PRIMARY.csv") #path to model output primary filename/' $FILE_CB
Apptainer> sed -i '23s/.*/strategy_filename<-paste0(path_to_ssp_results, "ATTRIBUTE_STRATEGY.csv") #path to model output strategy filename/' $FILE_CB
Apptainer> CB_CONFIG_FILE="/opt/sisepuede_costs_benefits/Main/cb_config.R"
Apptainer> sed -i '19s/.*/sisepuede_data_git_path<-"\/opt\/sisepuede_data\/"/' $CB_CONFIG_FILE
Apptainer> sed -i '20s/.*/ssp_costs_benefits_git_path<-"\/opt\/sisepuede_costs_benefits\/"/' $CB_CONFIG_FILE
Apptainer> sed -i '195,$d' $FILE_CB
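Because the sed commands above address lines by number, they silently overwrite whatever sits on those lines if the upstream file changes. A small check after the edits can catch this. The following is a minimal sketch; the helper name `find_mismatches` is hypothetical, and the demonstration runs on a throwaway file rather than on the real script.

```python
import tempfile
from pathlib import Path


def find_mismatches(path, expected):
    """Return {line_number: actual_text} for each 1-indexed line that does not start with its expected prefix."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    mismatches = {}
    for number, prefix in expected.items():
        actual = lines[number - 1] if number <= len(lines) else ""
        if not actual.lstrip().startswith(prefix):
            mismatches[number] = actual
    return mismatches


# Self-contained demonstration on a throwaway three-line file.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.R"
    demo.write_text(
        'setwd("/opt/x/")\n# comment\npath_to_ssp_results<-"/home/user/"\n',
        encoding="utf-8",
    )
    demo_result = find_mismatches(
        demo, {1: "setwd(", 3: 'path_to_ssp_results<-"/opt/ssp/"'}
    )

print(demo_result)  # line 3 is reported because it still holds the old path
```

Inside the container, the same function could be called on $FILE_CB with prefixes taken from the sed commands, for example `{6: 'setwd("/opt/sisepuede_costs_benefits/Main/")', 13: 'path_to_model_results<-"/opt/cb/"', 14: 'path_to_ssp_results<-"/opt/ssp/"'}`; an empty result means every checked line holds the intended text.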
Run the bash script that fixes some CSV files:
Apptainer> cd /opt/sisepuede_costs_benefits/cost_factors
Apptainer> bash append_newlines_to_csvs.sh
Apptainer> cd /opt/sisepuede_costs_benefits/strategy_specific_cb_files
Apptainer> bash append_newlines_to_csvs.sh
Apptainer> cd /opt
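The contents of append_newlines_to_csvs.sh are not shown in this document. Judging only by its name, it presumably makes sure that every CSV file ends with a newline, which avoids incomplete-final-line warnings when R reads the files. Under that assumption, an equivalent operation could be sketched in Python as follows; the function name is hypothetical and the demonstration uses temporary files.

```python
import tempfile
from pathlib import Path


def ensure_trailing_newline(directory):
    """Append a newline to every *.csv in directory that lacks one; return the names changed."""
    changed = []
    for csv_path in sorted(Path(directory).glob("*.csv")):
        data = csv_path.read_bytes()
        if data and not data.endswith(b"\n"):
            csv_path.write_bytes(data + b"\n")
            changed.append(csv_path.name)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.csv").write_bytes(b"x,y\n1,2")    # no final newline
    (Path(tmp) / "b.csv").write_bytes(b"x,y\n3,4\n")  # already terminated
    fixed = ensure_trailing_newline(tmp)
    a_after = (Path(tmp) / "a.csv").read_bytes()

print(fixed)
```

Only files that actually lack the final newline are rewritten, so running the function twice changes nothing the second time.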
Execute Costs and Benefits module
In order to execute the Costs and Benefits module, we need data from the SISEPUEDE model to be processed.
Since the host operating system sees the sandbox as just another directory, we can move files between the host and the sandbox with commands such as cp, mv, or rsync.
Suppose that we have the zip file ssp_armenia.zip, which contains the outputs of the SISEPUEDE model:
unzip -l ssp_armenia.zip
Archive: ssp_armenia.zip
Length Date Time Name
--------- ---------- ----- ----
6428502 2024-04-30 19:52 sisepuede_results_sisepuede_run_armenia.csv
114 2024-04-30 19:46 ATTRIBUTE_PRIMARY.csv
691 2024-04-30 19:41 ANALYSIS_METADATA.csv
4012586 2024-04-30 19:41 MODEL_BASE_INPUT_DATABASE.csv
3333377 2024-04-30 19:46 MODEL_OUTPUT.csv
3223195 2024-04-30 19:46 MODEL_INPUT.csv
460 2024-04-30 19:41 ATTRIBUTE_DESIGN.csv
16745 2024-04-30 19:41 ATTRIBUTE_STRATEGY.csv
--------- -------
17015670 8 files
The ssp_armenia.zip file can be downloaded from the URL:
We must copy this file to the directory /opt/ssp inside the sandbox:
ls -l
total 33104
-rw-r--r-- 1 milo milo 3081 May 1 14:34 cb_model.def
drwxr-xr-x 18 milo milo 4096 Apr 28 20:31 cb_module
-rw-r--r-- 1 milo milo 4077591 May 1 14:34 ssp_armenia.zip
-rwxr-xr-x 1 milo milo 29810688 May 1 14:36 ubuntu_22.04.sif
cp ssp_armenia.zip cb_module/opt/ssp
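The R script is pointed at three inputs by the sed overrides earlier in this document: a file matching sisepuede_results_sisepuede_run_*, ATTRIBUTE_PRIMARY.csv, and ATTRIBUTE_STRATEGY.csv. It can save a failed run to check on the host that an archive contains them. The following is a minimal sketch using only the Python standard library; the function name `missing_inputs` is hypothetical and the demonstration builds its own small archive.

```python
import fnmatch
import tempfile
import zipfile
from pathlib import Path

# Patterns taken from the file names set by the sed overrides.
REQUIRED_PATTERNS = [
    "sisepuede_results_sisepuede_run_*",
    "ATTRIBUTE_PRIMARY.csv",
    "ATTRIBUTE_STRATEGY.csv",
]


def missing_inputs(zip_path):
    """Return the required patterns that match no member of the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return [pattern for pattern in REQUIRED_PATTERNS if not fnmatch.filter(names, pattern)]


# Demonstration on a throwaway archive that lacks the strategy file.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("sisepuede_results_sisepuede_run_demo.csv", "a,b\n")
        archive.writestr("ATTRIBUTE_PRIMARY.csv", "a,b\n")
    demo_missing = missing_inputs(demo_zip)

print(demo_missing)
```

For the archive used here, `missing_inputs("ssp_armenia.zip")` would be expected to return an empty list, given the listing shown above.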
Open a shell in the sandbox again:
apptainer shell -pfw --no-mount home cb_module
You will see the ssp_armenia.zip file in the directory /opt/ssp:
Apptainer> ls /opt/ssp
ssp_armenia.zip
Unzip ssp_armenia.zip in the directory /opt/ssp:
Apptainer> apt install -y unzip
Apptainer> unzip /opt/ssp/ssp_armenia.zip -d /opt/ssp
Archive: /opt/ssp/ssp_armenia.zip
inflating: /opt/ssp/sisepuede_results_sisepuede_run_armenia.csv
inflating: /opt/ssp/ATTRIBUTE_PRIMARY.csv
inflating: /opt/ssp/ANALYSIS_METADATA.csv
inflating: /opt/ssp/MODEL_BASE_INPUT_DATABASE.csv
inflating: /opt/ssp/MODEL_OUTPUT.csv
inflating: /opt/ssp/MODEL_INPUT.csv
inflating: /opt/ssp/ATTRIBUTE_DESIGN.csv
inflating: /opt/ssp/ATTRIBUTE_STRATEGY.csv
The environment for running the Costs and Benefits module is now ready. Run the following commands to execute the Costs and Benefits main program:
Apptainer> FILE_CB="/opt/sisepuede_costs_benefits/Main/cb_calculate_costs_and_benefits_script.R"
Apptainer> LC_ALL=C.UTF-8 Rscript $FILE_CB
The outputs of the execution are in the directory /opt/cb:
Apptainer> apt install -y tree
Apptainer> tree /opt/cb
/opt/cb
|-- cost_benefit_results.csv
|-- economy_wide_cost_benefit_results.csv
|-- net_benefit_net_ghg.csv
`-- sisepuede_results_TRIMMED_LONG.csv
0 directories, 4 files
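A quick way to confirm that a run completed is to check that the four files listed above exist and are not empty. The following is a minimal sketch; the function name `missing_or_empty` is hypothetical, and the demonstration uses a temporary directory instead of /opt/cb.

```python
import tempfile
from pathlib import Path

# File names taken from the tree listing of /opt/cb.
EXPECTED_OUTPUTS = [
    "cost_benefit_results.csv",
    "economy_wide_cost_benefit_results.csv",
    "net_benefit_net_ghg.csv",
    "sisepuede_results_TRIMMED_LONG.csv",
]


def missing_or_empty(directory):
    """Return the expected output files that are absent or have zero size."""
    base = Path(directory)
    return [
        name
        for name in EXPECTED_OUTPUTS
        if not (base / name).is_file() or (base / name).stat().st_size == 0
    ]


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cost_benefit_results.csv").write_text("a,b\n1,2\n")
    (Path(tmp) / "net_benefit_net_ghg.csv").write_text("")  # present but empty
    demo_problems = missing_or_empty(tmp)

print(demo_problems)
```

Inside the container, `missing_or_empty("/opt/cb")` returning an empty list would indicate that all four outputs were written.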
Creating Apptainer image from Definition File
For a reproducible, verifiable and production-quality container, the Apptainer documentation recommends building the SIF file from an Apptainer definition file. A definition file can be thought of as the counterpart of a Dockerfile: it contains a script of instructions for building the image.
Create the definition file cb_model.def with the following content:
Bootstrap: docker
From: ubuntu:22.04
%post
apt update
DEBIAN_FRONTEND=noninteractive TZ=America/Mexico_City apt -y install tzdata
apt install -y python3.11 python3.11-venv python3-venv python3-dev python3-pip
apt install -y git
cd /opt
git clone https://github.com/nidiot/sisepuede_costs_benefits.git
git clone https://github.com/jcsyme/sisepuede.git
mkdir -p /opt/sisepuede_data/Energy/nemomod_entc_residual_capacity_pp_gas_gw/raw_data
cd /opt/sisepuede_data/Energy/nemomod_entc_residual_capacity_pp_gas_gw/raw_data
apt install -y wget
wget https://raw.githubusercontent.com/milocortes/sisepuede_data/main/Energy/nemomod_entc_residual_capacity_pp_gas_gw/raw_data/iso3_all_countries.csv
cd /opt
apt install -y r-base r-base-dev
LC_ALL=C.UTF-8 R -e 'install.packages(c("purrr", "stringr", "tidyr", "data.table", "readxl", "dplyr", "reshape2", "lhs", "reshape"))'
wget https://raw.githubusercontent.com/milocortes/sisepuede_data/main/utils/actualiza_datos_cb.py
echo "
pandas==2.2.1
scikit-learn
openpyxl
munch==2.5.0
PyYAML==6.0
geopy==2.1.0
SQLAlchemy==2.0.29
julia==0.6.2
" > requirements.txt
pip3 install -r requirements.txt
python3 actualiza_datos_cb.py
mkdir -p /opt/ssp
mkdir -p /opt/cb
FILE_CB="/opt/sisepuede_costs_benefits/Main/cb_calculate_costs_and_benefits_script.R"
sed -i '6s/.*/setwd("\/opt\/sisepuede_costs_benefits\/Main\/")/' $FILE_CB
sed -i '13s/.*/path_to_model_results<-"\/opt\/cb\/"/' $FILE_CB
sed -i '14s/.*/path_to_ssp_results<-"\/opt\/ssp\/"/' $FILE_CB
sed -i '15s/.*/data_filename<-paste0(path_to_ssp_results,/' $FILE_CB
sed -i '16s/.*/ list.files(path=path_to_ssp_results, /' $FILE_CB
sed -i '17s/.*/ pattern = glob2rx("sisepuede_results_sisepuede_run_*"))) #path to model output runs/' $FILE_CB
sed -i '22s/.*/primary_filename<-paste0(path_to_ssp_results, "ATTRIBUTE_PRIMARY.csv") #path to model output primary filename/' $FILE_CB
sed -i '23s/.*/strategy_filename<-paste0(path_to_ssp_results, "ATTRIBUTE_STRATEGY.csv") #path to model output strategy filename/' $FILE_CB
CB_CONFIG_FILE="/opt/sisepuede_costs_benefits/Main/cb_config.R"
sed -i '19s/.*/sisepuede_data_git_path<-"\/opt\/sisepuede_data\/"/' $CB_CONFIG_FILE
sed -i '20s/.*/ssp_costs_benefits_git_path<-"\/opt\/sisepuede_costs_benefits\/"/' $CB_CONFIG_FILE
sed -i '195,$d' $FILE_CB
cd /opt/sisepuede_costs_benefits/cost_factors
bash append_newlines_to_csvs.sh
cd /opt/sisepuede_costs_benefits/strategy_specific_cb_files
bash append_newlines_to_csvs.sh
cd /opt
apt install -y unzip zip
echo "
#!/bin/bash
cp \$1 /opt/ssp
cd /opt/ssp
unzip *
LC_ALL=C.UTF-8 Rscript $FILE_CB
" > ejecuta-cb
%environment
FILE_CB="/opt/sisepuede_costs_benefits/Main/cb_calculate_costs_and_benefits_script.R"
%runscript
bash /opt/ejecuta-cb $*
country=$2
zip "cb_${country}.zip" -r -j /opt/cb
Build the SIF
apptainer build cb_cl.sif cb_model.def
We can create a symbolic link to cb_cl.sif so that it can be executed like any other system executable:
SIF_PATH="$(pwd)/cb_cl.sif"
sudo ln -sv "$SIF_PATH" /usr/local/bin/cb_cl
We must set an Apptainer environment variable so that the SIF runs with a writable temporary file system (the equivalent of the --writable-tmpfs option):
export APPTAINER_WRITABLE_TMPFS="true"
Execute the SIF on the zip file ssp_armenia.zip, which contains the outputs of the SISEPUEDE model. The second argument is the country name used to name the result file:
./cb_cl.sif ssp_armenia.zip armenia
The execution returns a zip file, cb_armenia.zip, containing the Costs and Benefits outputs computed from the SISEPUEDE model results.