Running Submarine on YARN
Submarine for YARN supports the TensorFlow, PyTorch, and MXNet frameworks. It leverages TonY, created by LinkedIn, to run deep learning training jobs on YARN.
Submarine also supports the GPU-on-YARN and Docker-on-YARN features.
Submarine can run on Hadoop 2.7.3 or later. The GPU-on-YARN and Docker-on-YARN features require a newer Hadoop version; refer to the next section to choose the right one.
Hadoop version
Required:
- Apache Hadoop 2.7.3 or later
Optional:
- To use the GPU-on-YARN feature with Submarine, make sure Hadoop is version 2.10.0 or later (or 3.1.0 or later on the 3.x line), and follow Enable GPU on YARN 2.10.0+ to enable it.
- To run training jobs in Docker containers, make sure Hadoop is version 2.8.2 or later, and follow Enable Docker on YARN 2.8.2+ to enable the Docker-on-YARN feature.
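The version rules above can be checked mechanically. The following is a minimal sketch, not part of Submarine: the helper names `parse_version` and `submarine_features` are hypothetical, and it assumes the version string looks like the first line printed by `hadoop version` (for example `Hadoop 3.1.0`). It also assumes that "2.10.0+ (or 3.1.0+)" means 3.0.x releases do not qualify for GPU-on-YARN.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from a string such as 'Hadoop 3.1.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def submarine_features(version):
    """Map a Hadoop (major, minor, patch) tuple to the features listed above.

    Thresholds come from this section: 2.7.3 for Submarine itself,
    2.8.2 for Docker-on-YARN, and 2.10.0 (2.x line) or 3.1.0 (3.x line)
    for GPU-on-YARN. Treating 3.0.x as lacking GPU support is an assumption.
    """
    gpu_on_2x = (2, 10, 0) <= version < (3, 0, 0)
    gpu_on_3x = version >= (3, 1, 0)
    return {
        "submarine": version >= (2, 7, 3),
        "docker_on_yarn": version >= (2, 8, 2),
        "gpu_on_yarn": gpu_on_2x or gpu_on_3x,
    }


if __name__ == "__main__":
    # Example input; on a real cluster, pass the first line of `hadoop version`.
    print(submarine_features(parse_version("Hadoop 2.9.2")))
```

For the example input, Docker-on-YARN is reported as available and GPU-on-YARN is not, because 2.9.2 is below the 2.10.0 threshold.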
Submarine YARN Runtime Guide
The YARN Runtime Guide describes how to use Submarine to run jobs on YARN, with or without Docker.