Submarine Local Deployment

Prerequisite
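The commands in this guide invoke git, minikube, kubectl, and helm. The following is a minimal preflight sketch, assuming those four tools are the full set needed. The function name `require_tools` is hypothetical and not part of Submarine.

```shell
# Check that every command named in the arguments is available on PATH.
# Prints each missing tool to stderr and returns non-zero if any is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (tool list inferred from the commands used in this guide):
# require_tools git minikube kubectl helm
```

Running the check before `minikube start` surfaces a missing tool early, before any cluster resources have been created.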

Deploy Kubernetes Cluster

$ minikube start --vm-driver=docker --cpus 8 --memory 4096 --disk-size=20G --kubernetes-version v1.15.11

Install Submarine on Kubernetes

$ git clone https://github.com/apache/submarine.git
$ cd submarine
$ helm install submarine ./helm-charts/submarine
NAME: submarine
LAST DEPLOYED: Fri Jan 29 05:35:36 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
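To check the release state again later, `helm status submarine` prints the same kind of summary. As a small sketch, the `STATUS` field can be pulled out of that output for use in a script. The helper name `release_status` is hypothetical.

```shell
# Extract the STATUS field from `helm status <release>` style output on stdin.
release_status() {
  awk -F': ' '$1 == "STATUS" { print $2; exit }'
}

# Example against a live cluster:
# helm status submarine | release_status
```

A value other than `deployed` suggests the install did not complete and is worth investigating before moving on.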

Verify installation

Once the installation finishes, run the command below. The output should look similar to this:

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
notebook-controller-deployment-5db8b6cbf7-k65jm 1/1 Running 0 5s
pytorch-operator-7ff5d96d59-gx7f5 1/1 Running 0 5s
submarine-database-8d95d74f7-ntvqp 1/1 Running 0 5s
submarine-server-b6cd4787b-7bvr7 1/1 Running 0 5s
submarine-traefik-9bb6f8577-66sx6 1/1 Running 0 5s
tf-job-operator-7844656dd-lfgmd 1/1 Running 0 5s
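Reading the pod list by eye works for a handful of pods. For scripted checks, the following sketch filters the same listing and prints only the pods that are not yet up. It assumes the default column order shown above (NAME, READY, STATUS). The helper name `unready_pods` is hypothetical.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the name of
# every pod whose STATUS is not Running or whose READY count is incomplete.
# Returns non-zero if at least one such pod is found.
unready_pods() {
  awk '
    {
      split($2, ready, "/")            # READY column looks like "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) {
        print $1
        bad = 1
      }
    }
    END { exit bad + 0 }
  '
}

# Example against a live cluster:
# kubectl get pods --no-headers | unready_pods
```

Note that this sketch treats any status other than `Running` as not ready, so a pod in `Completed` state would also be listed. Empty input is reported as success.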
Warning: if you encounter the following error during installation:

Error: rendered manifests contain a resource that already exists.
Unable to continue with install: existing resource conflict: namespace: , name: podgroups.scheduling.incubator.k8s.io, existing_kind: apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition, new_kind: apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition

This is likely caused by CRDs left over from a previous installation of the Submarine charts. Delete them by running:

$ kubectl delete crd/tfjobs.kubeflow.org && kubectl delete crd/podgroups.scheduling.incubator.k8s.io && kubectl delete crd/pytorchjobs.kubeflow.org
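Because the command above chains the deletions with `&&`, it stops at the first CRD that does not exist. A sketch of a more tolerant variant follows, using the `--ignore-not-found` flag of `kubectl delete`. The function name `delete_submarine_crds` and the `KUBECTL` override variable are hypothetical conveniences, not part of Submarine.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Delete the three CRDs named in the command above, one at a time,
# skipping any that are already absent.
delete_submarine_crds() {
  for crd in tfjobs.kubeflow.org \
             podgroups.scheduling.incubator.k8s.io \
             pytorchjobs.kubeflow.org; do
    "$KUBECTL" delete crd "$crd" --ignore-not-found || return 1
  done
}

# Example:
# delete_submarine_crds
```

Setting `KUBECTL=echo` before calling the function prints the three delete commands without touching the cluster, which is a cheap way to review them first.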

Use Port Forwarding to Access Submarine in a Cluster

# Listen on port 32080 on all addresses, forwarding to port 80 of the submarine-traefik service
$ kubectl port-forward --address 0.0.0.0 service/submarine-traefik 32080:80
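The port-forward command keeps running in the foreground, so any check has to happen from a second terminal. As a minimal sketch, a generic retry helper can poll the forwarded port until it answers. The helper name `retry` is hypothetical, and the example assumes `curl` is installed.

```shell
# Run a command up to <attempts> times, sleeping <delay> seconds between tries.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}

# Example, from a second terminal while the port-forward is running:
# retry 10 2 curl -sS -o /dev/null http://127.0.0.1:32080
```

The example omits curl's `-f` flag on purpose, so any HTTP response counts as success and only a refused or failed connection triggers another attempt.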

Open Workbench in the Browser

Open http://127.0.0.1:32080 in your browser. The default username and password are both admin.

Uninstall Submarine

$ helm delete submarine