This section will describe how to use PEcAn with remote machines. This will be split into three pieces. The first section will describe how to setup your system to allow you to execute remotely without any password prompts. The next section will describe any items to add to the pecan.xml and the config.php files to enable remote execution from both PEcAn on the command line and PEcAn on the web. Finally the final section will describe how to install sub pieces of PEcAn on machines, to allow those machines to run the models.
SSH to remote machines
PEcAn leverages of SSH to communicate with remote machines. PEcAn expects to be able to connect to the remote machines without any password prompts. This can be accomplished many different ways, two common methods are to use a public/private keypair, the other is to use tunnels. The web interface is build to leverage the latter.
Once setup you should be able to login to the remote machine without being asked for a password.
public/prvate keypairs
Before the first time, run scripts/sshkey.sh
This will create a
public/private keypair, and places the public key on the remote server.
machine authentcation
Some systems (e.g. BU cluster) use authentication at the machine-level rather than the user-level. In this case you will need to create a .rhosts file in your home directory on the remote machine and list the servers you want to connect from. See issue #428
ssh tunnels
This works especially well for servers that use two factor authentication. This method leverages of the ability of SSH to send multiple channels across the same encrypted connection (tunnel). You will setup the first connection, and all subsequent connections will use the same connection.
To automatically create the tunnels, you can add the following to your
~/.ssh/config
Host <hostname goes here>
ControlMaster auto
ControlPath /tmp/%r@%h:%p
You can also create the tunnel using the following command:
ssh -o ControlMaster=yes -o ControlPath=/tmp/mytunnel -l username host
Next create a single ssh connection to the host, any ssh connection afterwards will use the same connection and should not require a password.
Before running the PEcAn workflow, open (and leave open) a single ssh connection from the local machine to the remote machine.
Configuring PEcAn to execute remotely
To enable PEcAn to run remotely we will need to modify the pecan.xml
to specify what host to connect to, what user to connect as, and how
to connect. The web interface will create the pecan.xml
with the
appropriate entries.
config.php for PEcAn web interface
The config.php has a few variables that will control where the web
interface can run jobs, and how to run those jobs. These variables
are $hostlist
, $qsublist
, $qsuboptions
, and $SSHtunnel
. In
the near future $hostlist
, $qsublist
, $qsuboptions
will be
combined into a single list.
$SSHtunnel
: points to the script that creates an SSH tunnel.
The script is located in the web folder and the default value of
dirname(__FILE__) . DIRECTORY_SEPARATOR . "sshtunnel.sh";
most
likely will work.
$hostlist
: is an array with by default a single value, only
allowing jobs to run on the local server. Adding any other servers
to this list will show them in the pull down menu when selecting
machines, and will trigger the web page to be show to ask for a
username and password for the remote execution (make sure to use
HTTPS setup when asking for password to prevent it from being send
in the clear).
$qsublist
: is an array of hosts that require qsub to be used
when running the models. This list can include $fqdn
to indicate
that jobs on the local machine should use qsub to run the models.
$qsuboptions
: is an array that lists options for each machine.
Currently it support the following options (see also
PEcAn-Configuration)
array("geo.bu.edu" =>
array("qsub" => "qsub -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash",
"jobid" => "Your job ([0-9]+) .*",
"qstat" => "qstat -j @JOBID@ || echo DONE",
"job.sh" => "module load udunits R/R-3.0.0_gnu-4.4.6",
"models" => array("ED2" => "module load hdf5"))
In this list qsub
is the actual command line for qsub, jobid
is the text returned from qsub, qstat
is the command to check
to see if the job is finished. job.sh
and the value in models
are additional entries to add to the job.sh file generated to
run the model. This can be used to make sure modules are loaded
on the HPC cluster before running the actual model.
pecan.xml for PEcAn command line runs
To enable remote execution from the command line you will need to add the <host> tag to pecan.xml (under <run>). This will let PEcAn know it should run the model on the remote system. You will need to specify the <name> tag to specify the remote machine. You can add <user> to specify the user, or you can use <tunnel> to specify the location of the tunnel to be used.
You can also add <job.sh> to both <host> and <model> to add specific information to the job.sh file used to run the model.
Running PEcAn code for modules remotely
You can compile and install the model specific code pieces of PEcAn on the cluster easily without having to install the full code base of PEcAn (and all OS dependencies). All of the code pieces depend on PEcAn.utils to install this you can run the following on the cluster:
devtools::install_github("pecanproject/pecan", subdir = 'utils')
Next we need to install the model specific pieces, this is done almost the same:
devtools::install_github("pecanproject/pecan", subdir = 'models/ed')
This should install dependencies required. Following are some notes on how to install the model specifics on different HPC clusters.
geo.bu.edu
Following modules need to be loaded:
module load hdf5 udunits R/R-3.0.0_gnu-4.4.6
Next the following packages need to be installed, otherwise it will fall back on the older versions install site-library
install.packages(c('udunits2', 'lubridate'),
configure.args=c(udunits2='--with-udunits2-lib=/project/earth/packages/udunits-2.1.24/lib --with-udunits2-include=/project/earth/packages/udunits-2.1.24/include'),
repos='http://cran.us.r-project.org')
Finally to install support for both ED and SIPNET:
devtools::install_github("pecanproject/pecan", subdir = 'utils')
devtools::install_github("pecanproject/pecan", subdir = 'models/sipnet')
devtools::install_github("pecanproject/pecan", subdir = 'models/ed')