Version: 0.8.0

Experiment Client

class ExperimentClient()

Client for a Submarine server that creates and manages experiments and their logs.

`create_experiment(experiment_spec) -> dict`

Create an experiment.

Param	Type	Description	Default Value
experiment_spec	Dict	Submarine experiment spec. More detailed information about the spec can be found at Experiment API.	x

Returns

The detailed info about the submarine experiment.

Example

from submarine import ExperimentClient
client = ExperimentClient()
client.create_experiment({
  "meta": {
    "name": "tf-mnist-json",
    "namespace": "default",
    "framework": "TensorFlow",
    "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150",
    "envVars": {
      "ENV_1": "ENV1"
    }
  },
  "environment": {
    "image": "apache/submarine:tf-mnist-with-summaries-1.0"
  },
  "spec": {
    "Ps": {
      "replicas": 1,
      "resources": "cpu=1,memory=1024M"
    },
    "Worker": {
      "replicas": 1,
      "resources": "cpu=1,memory=1024M"
    }
  }
})
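
When several experiments share the same layout, the spec dictionary can be assembled by a small helper instead of being typed out each time. The sketch below is not part of the Submarine SDK: `build_tf_spec` and its parameters are hypothetical names, and the keys simply mirror the example above.

```python
def build_tf_spec(name, cmd, image, workers=1, ps=1,
                  resources="cpu=1,memory=1024M", namespace="default",
                  env_vars=None):
    """Assemble a TensorFlow experiment spec shaped like the example above.

    Hypothetical helper, not part of the Submarine SDK.
    """
    return {
        "meta": {
            "name": name,
            "namespace": namespace,
            "framework": "TensorFlow",
            "cmd": cmd,
            # Copy so the caller's dict is never shared with the spec.
            "envVars": dict(env_vars or {}),
        },
        "environment": {"image": image},
        "spec": {
            "Ps": {"replicas": ps, "resources": resources},
            "Worker": {"replicas": workers, "resources": resources},
        },
    }


spec = build_tf_spec(
    name="tf-mnist-json",
    cmd="python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log",
    image="apache/submarine:tf-mnist-with-summaries-1.0",
    workers=2,
    env_vars={"ENV_1": "ENV1"},
)
# The resulting dict can then be passed to client.create_experiment(spec).
```

Keeping the builder separate from the client call makes the spec easy to inspect or log before anything is submitted.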

`patch_experiment(id, experiment_spec) -> dict`

Patch an experiment.

Param	Type	Description	Default Value
id	String	Submarine experiment id.	x
experiment_spec	Dict	Submarine experiment spec. More detailed information of Submarine experiment spec can be found at Experiment API.	x

Returns

The detailed info about the submarine experiment.

Example

client.patch_experiment("experiment_1626160071451_0008", {
  "meta": {
    "name": "tf-mnist-json",
    "namespace": "default",
    "framework": "TensorFlow",
    "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150",
    "envVars": {
      "ENV_1": "ENV1"
    }
  },
  "environment": {
    "image": "apache/submarine:tf-mnist-with-summaries-1.0"
  },
  "spec": {
    "Worker": {
      "replicas": 2,
      "resources": "cpu=1,memory=1024M"
    }
  }
})
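
A patch often differs from the original spec in only one or two fields, such as the Worker replica count in the example above. One way to avoid retyping the whole dictionary is to copy the original and override that field. `with_worker_replicas` is a hypothetical helper, not an SDK function, and the sketch assumes the full spec is sent with the patch.

```python
import copy


def with_worker_replicas(spec, replicas):
    """Return a deep copy of spec with the Worker replica count replaced.

    Hypothetical helper, not part of the Submarine SDK.
    The input dictionary is left unmodified.
    """
    patched = copy.deepcopy(spec)
    patched["spec"]["Worker"]["replicas"] = replicas
    return patched


original = {
    "meta": {"name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow"},
    "environment": {"image": "apache/submarine:tf-mnist-with-summaries-1.0"},
    "spec": {"Worker": {"replicas": 1, "resources": "cpu=1,memory=1024M"}},
}
patched = with_worker_replicas(original, 2)
# patched can then be passed as the second argument of client.patch_experiment.
```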

`get_experiment(id) -> dict`

Get the experiment's detailed info by id.

Param	Type	Description	Default Value
id	String	Submarine experiment id.	x

Returns

The detailed info about the submarine experiment.

Example

experiment = client.get_experiment("experiment_1626160071451_0008")

`list_experiments(status) -> list[dict]`

List all experiments for the user.

Param	Type	Description	Default Value
status	Optional[str]	Experiment status to filter by: Accepted, Created, Running, Succeeded, or Deleted.	None

Returns

List of submarine experiments.

Example

experiments = client.list_experiments()
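
Because the `status` filter is a plain string, a typo is easy to miss. A small client-side check against the values listed in the table above can catch it before the request is made. `check_status` is a hypothetical helper, and treating the listed values as the complete, case-sensitive set (with `None` meaning "no filter") is an assumption.

```python
# Status values copied from the parameter table above.
KNOWN_STATUSES = ("Accepted", "Created", "Running", "Succeeded", "Deleted")


def check_status(status=None):
    """Return status unchanged if it is None or a listed value, else raise ValueError.

    Hypothetical helper, not part of the Submarine SDK.
    """
    if status is not None and status not in KNOWN_STATUSES:
        raise ValueError(
            f"unknown status {status!r}; expected one of {', '.join(KNOWN_STATUSES)}"
        )
    return status


running_filter = check_status("Running")
no_filter = check_status()  # None is assumed to mean "do not filter"
# e.g. client.list_experiments(check_status("Running"))
```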

`delete_experiment(id) -> dict`

Delete the submarine experiment.

Param	Type	Description	Default Value
id	String	Submarine experiment id.	x

Returns

The detailed info about the deleted submarine experiment.

Example

client.delete_experiment("experiment_1626160071451_0008")

`get_log(id, onlyMaster)`

Print the training logs of the experiment's pods. By default, the logs of all pods are printed.

Param	Type	Description	Default Value
id	String	Submarine experiment id.	x
onlyMaster	Optional[bool]	If true, include only the log of the "master" pod, which might be the TensorFlow PS/Chief or the PyTorch master.	False

Returns

The pod log info.

Example

client.get_log("experiment_1626160071451_0009")

`list_log(status)`

List experiment logs.

Param	Type	Description	Default Value
status	String	Experiment status to filter by: Accepted, Created, Running, Succeeded, or Deleted.	x

Returns

List of submarine experiment logs.

Example

logs = client.list_log("Succeeded")

`wait_for_finish(id, polling_interval)`

Wait until the experiment has finished or failed.

Param	Type	Description	Default Value
id	String	Submarine experiment id.	x
polling_interval	Optional[int]	Number of seconds to wait between two polls of the experiment's status.	10

Returns

Submarine experiment logs.

Example

logs = client.wait_for_finish("experiment_1626160071451_0009", 5)
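
Conceptually, waiting for an experiment is a polling loop: check the status, stop if it is terminal, otherwise sleep for `polling_interval` seconds and check again. The sketch below illustrates that pattern only and is not the SDK's implementation. `poll_until_done`, the `get_status` callable, and the set of terminal status names (including "Failed", which does not appear in the status lists on this page) are assumptions.

```python
import time


def poll_until_done(get_status, polling_interval=10,
                    terminal=("Succeeded", "Failed", "Deleted"),
                    sleep=time.sleep):
    """Call get_status() until it returns a terminal status, sleeping in between.

    Illustrative only: get_status is any zero-argument callable returning a
    status string, and the terminal names are assumed rather than taken
    from the SDK.
    """
    while True:
        status = get_status()
        if status in terminal:
            return status
        sleep(polling_interval)


# Simulated status sequence so the sketch runs without a server.
statuses = iter(["Accepted", "Running", "Running", "Succeeded"])
waits = []  # records each requested sleep instead of actually sleeping
final = poll_until_done(lambda: next(statuses), polling_interval=5, sleep=waits.append)
```

Passing the sleep function as a parameter keeps the loop testable; in real use the default `time.sleep` applies.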
