How to run on the AWS DP HPC cluster interactively

The Rapthor Prefect prototype can be run on a single node on AWS using the provided example shell script, aws-interactive.sh. The script loads dependencies available via Spack and installs the additional Python dependencies (via Poetry) in a virtual environment. It then starts a local Prefect server and runs the code in poc.py. The script has been tested on the AWS DP HPC cluster using an interactive session on a compute node with 8 CPUs.
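
The sequence described above can be pictured with the dry-run sketch below. It is not the contents of aws-interactive.sh: the Spack package name `python` and the function names `run` and `setup_and_run` are assumptions made purely for illustration. With `DRY_RUN=1` (the default here) the commands are only printed, not executed.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the real aws-interactive.sh.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_and_run() {
    # 1. Load dependencies provided by Spack (the package name is an assumption).
    run spack load python
    # 2. Install the remaining Python dependencies into a virtual environment.
    run poetry install
    # 3. Start a local Prefect server (this blocks; a real script would
    #    need to run it in the background before the next step).
    run poetry run prefect server start
    # 4. Run the pipeline code.
    run poetry run python poc.py
}

setup_and_run
```

Printing the commands first makes it easy to see the order of operations before anything is installed or started on the compute node.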

Prerequisites

  • An account on the AWS DP HPC cluster

  • Optional: an SSH tunnel to the compute node you will be using, forwarding port 4200 (Prefect UI) and port 8787 (Dask dashboard) to your local machine. Setting this up is covered in step 7.

Steps

  1. Log in to the DP HPC head node. Information about the DP HPC cluster, including how to gain access, is available from this Confluence page.

  2. Change to a directory where you want to clone the project.

  3. Clone the project repository using git.

    git clone https://gitlab.com/ska-telescope/sdp/ska-sdp-rapthor-prefect-prototype.git
    
  4. Change to the project directory.

    cd ska-sdp-rapthor-prefect-prototype
    
  5. Optionally check out the branch of the repo that you want to test:

    git checkout <feature-branch>
    
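  6. Start an interactive session on a compute node. The command below requests one node with 8 CPUs for one hour and opens an interactive bash shell on it.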

    srun --nodes=1 --partition=any-r7i-24xl-spt --cpus-per-task=8 --ntasks-per-node=1 --time=1:00:00 --pty bash -i
    
  7. If you want to use an SSH tunnel to access the Prefect UI and Dask dashboard on your local machine, follow these additional steps:

    1. On the compute node, print its hostname and copy it. You will need it to set up port forwarding from your local machine.

      echo $HOSTNAME
      
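    2. On your local machine, set up an SSH tunnel through the head node to the compute node, forwarding port 4200 for the Prefect UI and port 8787 for the Dask dashboard. Replace <compute-node-hostname> with the hostname you copied in the previous step.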

      ssh -L 127.0.0.1:4200:<compute-node-hostname>:4200 -L 127.0.0.1:8787:<compute-node-hostname>:8787 dp-hpc-headnode
      
  8. If you will not run the script from the project directory, edit aws-interactive.sh to set the path to the project directory. By default the script uses the current directory.

  9. Run the aws-interactive.sh script. It installs the project and its dependencies via Poetry, checks the installation, and then runs the start.sh script to launch the pipeline. The command below also saves the script's output to a timestamped log file.

    ./aws-interactive.sh 2>&1 | tee "rpp-aws-$(date +"%FT%T").log"
    
  10. The Prefect server and the code in poc.py both start automatically when the script runs. If you have set up the SSH tunnel, once the Prefect server is up you can monitor the progress of the flow in the Prefect UI at http://localhost:4200 and in the Dask dashboard at http://localhost:8787.

  11. When the flow has completed, stop the Prefect server by pressing Ctrl+C in the terminal where you ran the script.
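
The tunnel command from step 7 and the wait for the Prefect UI in step 10 can be scripted. The following is a minimal sketch, assuming bash is installed on the machine where it runs. The function names `tunnel_cmd` and `wait_for_port` are hypothetical helpers, not part of the repository, and the default SSH destination `dp-hpc-headnode` is taken from the command in step 7.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and defaults are illustrative assumptions.

# Print the SSH tunnel command for a given compute node (step 7).
# Usage: tunnel_cmd <compute-node-hostname> [ssh-destination]
tunnel_cmd() {
    local node="$1"
    local headnode="${2:-dp-hpc-headnode}"
    local forwards=""
    local port
    for port in 4200 8787; do   # Prefect UI, Dask dashboard
        forwards="${forwards} -L 127.0.0.1:${port}:${node}:${port}"
    done
    echo "ssh${forwards} ${headnode}"
}

# Return 0 as soon as host:port accepts a TCP connection,
# or 1 after the given number of one-second attempts.
# Usage: wait_for_port <host> <port> [tries]
wait_for_port() {
    local host="$1" port="$2" tries="${3:-30}"
    local i=0
    while [ "$i" -lt "$tries" ]; do
        # /dev/tcp is a bash feature, so bash is invoked explicitly;
        # the connection is closed when that bash process exits.
        if bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example:
#   tunnel_cmd "$HOSTNAME"    # run on the compute node, paste the output locally
#   wait_for_port 127.0.0.1 4200 && echo "Prefect UI is reachable"
```

Running `tunnel_cmd "$HOSTNAME"` on the compute node prints a command that can be pasted into a local terminal, which avoids copying the hostname by hand. Once the tunnel is open, `wait_for_port` can be run locally to find out when the forwarded Prefect UI port starts accepting connections.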