Clients

Admin client

admin_client

class h2osteam.clients.admin.admin_client.AdminClient
static import_h2o_engine(path)

Import H2O engine to Steam.

Imports an H2O engine that is already on the disk of the Steam server and makes it available to users.

Parameters

path – Full path to the H2O engine on disk of the Steam server.

Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.import_h2o_engine("/tmp/h2o-3.26.0.6-cdh6.3.zip")
static import_python_environment(name, path)

Import an existing Python environment using the Python Pyspark Path.

Imports an existing Python environment to Steam using the path to the Python executable.

Parameters
  • name – Name of the new Python environment.

  • path – Full path to the python executable of the new Python environment.

Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.import_python_environment("python3", "/tmp/virtual-env/python3/bin/python")
static import_sparkling_engine(path)

Import Sparkling Water engine to Steam.

Imports a Sparkling Water engine that is already on the disk of the Steam server and makes it available to users.

Parameters

path – Full path to the Sparkling Water engine on disk of the Steam server.

Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.import_sparkling_engine("/tmp/sparkling-water-3.28.0.1-1-2.4.zip")
static upload_conda_environment(name, path)

Upload Conda Python environment.

Uploads and imports an existing Python environment using a path to a conda-packed Conda Python environment.

Parameters
  • name – Name of the new Python environment.

  • path – Full path to the conda-packed Conda Python environment.

Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.upload_conda_environment("python3-conda", "/tmp/conda-python3.tar.gz")
static upload_h2o_engine(path)

Upload H2O engine to Steam.

Uploads an H2O engine from the local machine to the Steam server, where it is imported and made available to users.

Parameters

path – Full path to the H2O engine on disk of the local machine.

Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.upload_h2o_engine("/tmp/h2o-3.26.0.6-cdh6.3.zip")
static upload_sparkling_engine(path)

Upload Sparkling Water engine to Steam.

Uploads a Sparkling Water engine from the local machine to the Steam server, where it is imported and made available to users.

Parameters

path – Full path to the Sparkling Water engine on disk of the local machine.

Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.upload_sparkling_engine("/tmp/sparkling-water-3.28.0.1-1-2.4.zip")
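The import_* and upload_* methods differ only in where the file lives: import_* expects a path on the disk of the Steam server, while upload_* expects a path on the local machine. The following is a minimal sketch of a helper that picks between the two for H2O engines. The helper name add_h2o_engine and its on_server flag are hypothetical and not part of h2osteam; the admin argument is expected to be AdminClient after a successful h2osteam.login call.

```python
def add_h2o_engine(admin, path, on_server):
    """Register an H2O engine with Steam.

    admin     -- object exposing import_h2o_engine and upload_h2o_engine,
                 for example h2osteam.clients.AdminClient (passed in so this
                 sketch has no import-time dependency on h2osteam)
    path      -- full path to the engine archive
    on_server -- hypothetical flag: True if `path` is on the Steam server's
                 disk, False if it is on the local machine
    """
    if on_server:
        # The file already sits on the Steam server: import it in place.
        return admin.import_h2o_engine(path)
    # The file is local: upload it, after which Steam imports it.
    return admin.upload_h2o_engine(path)


# Possible usage after h2osteam.login(...):
# from h2osteam.clients import AdminClient
# add_h2o_engine(AdminClient, "/tmp/h2o-3.26.0.6-cdh6.3.zip", on_server=False)
```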

Driverless AI Client

dai_client

dai_instance

H2O-3 Client

h2o_client

class h2osteam.clients.h2o.h2o_client.H2oClient
static get_cluster(name)

Get an existing H2O cluster.

Parameters

name – Name of the cluster.

Returns

H2O cluster as an H2oCluster object.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> H2oClient.get_cluster("test-cluster")
static get_clusters()

Get all H2O clusters available to this user.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> H2oClient.get_clusters()
static launch_cluster(name=None, version=None, profile_name=None, nodes=None, node_cpus=None, yarn_vcores=None, node_memory_gb=None, extra_memory_percent=None, max_idle_h=None, max_uptime_h=None, timeout_s=None, yarn_queue='', leader_node_id=0)

Launch a new H2O cluster.

Launches a new H2O cluster using the parameters described below. You do not need to specify all parameters; any parameter you omit is filled in from the default values of the selected profile. Launching a cluster can take up to 5 minutes.

Parameters
  • name – Name of the new cluster.

  • version – Version of H2O that will be used in the cluster.

  • profile_name – (Optional) Specify name of an existing profile that will be used for this cluster.

  • nodes – (Optional) Number of nodes of the H2O cluster.

  • node_cpus – (Optional) Number of CPUs/threads used by H2O on a single node. Specify ‘0’ to use all available CPUs/threads.

  • yarn_vcores – (Optional) Number of YARN virtual cores per cluster node. Should match node_cpus.

  • node_memory_gb – (Optional) Amount of memory in GB allocated for a single H2O node.

  • extra_memory_percent – (Optional) Percentage of extra memory that will be allocated outside of H2O JVM for algos like XGBoost.

  • max_idle_h – (Optional) Maximum amount of time in hours the cluster can be idle before shutting down.

  • max_uptime_h – (Optional) Maximum amount of time in hours the cluster will be up before shutting down.

  • timeout_s – (Optional) Maximum amount of time in seconds to wait for the H2O cluster to start.

  • yarn_queue – (Optional) Name of the YARN queue where the cluster will be placed.

  • leader_node_id – (Optional) ID of the H2O leader node.

Returns

H2O cluster as an H2oCluster object.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> H2oClient.launch_cluster(name="test-cluster",
...                          version="3.28.0.2",
...                          nodes=4,
...                          node_memory_gb=10)
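Because omitted parameters fall back to the selected profile, one way to keep launch calls short is to pass only the settings you want to override. The sketch below builds such a set of keyword arguments; the helper name launch_overrides is hypothetical, and the parameter names are the ones listed above.

```python
def launch_overrides(name, version, profile_name=None, **overrides):
    """Build keyword arguments for H2oClient.launch_cluster.

    Entries whose value is None are dropped, so the corresponding
    settings are left to the defaults of the selected profile.
    """
    kwargs = {"name": name, "version": version, "profile_name": profile_name}
    kwargs.update(overrides)
    return {key: value for key, value in kwargs.items() if value is not None}


kwargs = launch_overrides("test-cluster", "3.28.0.2",
                          nodes=4, node_memory_gb=10, max_idle_h=None)
print(kwargs)

# Possible usage after h2osteam.login(...):
# from h2osteam.clients import H2oClient
# cluster = H2oClient.launch_cluster(**kwargs)
```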

h2o_cluster

class h2osteam.clients.h2o.h2o_cluster.H2oCluster(cluster_id=None)
connect()

Connect to the H2O cluster using the H2O Python client.

Examples

>>> import h2o
>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.connect()
delete()

Delete a stopped H2O cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.delete()
download_autodoc(model=None, config=None, path=None, train_frame=None, valid_frame=None, test_frame=None, alternative_models=[])

Generates and downloads the AutoDoc report.

Parameters
  • model – H2OModel: the H2O model object for which the ml-autodoc will render a report document.

  • config – h2osteam.utils.AutoDocConfig: the configuration settings for download_autodoc.

  • path – str: Path where the generated AutoDoc report will be downloaded.

  • train_frame – H2OFrame: the training frame used to build the H2O model.

  • valid_frame – H2OFrame: the validation frame used to build the H2O model.

  • test_frame – H2OFrame: additional test dataset (not used during training) which contains the same feature names found in the training dataset.

  • alternative_models – list: a list of H2OModels. These are models that exist in the current running H2O cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.download_autodoc(model=titanic_glm, path="/tmp/report", test_frame="test")
download_autodoc_logs(path=None)

Download AutoDoc logs.

Parameters

path – Path where the AutoDoc logs will be saved.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.download_autodoc_logs("/tmp/test-cluster-logs")
download_logs(path=None)

Download logs of the H2O cluster.

Parameters

path – Path where the H2O logs will be saved.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.download_logs("/tmp/test-cluster-logs")
get_config()

Get connection config of the H2O cluster.

Get connection config of the H2O cluster that can be used as a parameter to h2o.connect. Use only if H2oCluster.connect() does not work for you.

Examples

>>> import h2o
>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> h2o.connect(config=cluster.get_config())
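The fallback described above can be sketched as a small helper that tries H2oCluster.connect() first and only then hands the Steam-provided config to h2o.connect. The helper name connect_with_fallback is hypothetical, and catching a broad Exception is an assumption, since this reference does not say how connect() reports failure.

```python
def connect_with_fallback(cluster, h2o_module):
    """Try cluster.connect(); on failure use h2o.connect(config=...).

    cluster    -- an H2oCluster, e.g. from H2oClient.get_cluster(...)
    h2o_module -- the imported `h2o` package (passed in so this sketch
                  has no import-time dependency on h2o)
    """
    try:
        return cluster.connect()
    except Exception:  # assumption: the failure type is not documented here
        # Hand the connection settings obtained from Steam to the H2O client.
        return h2o_module.connect(config=cluster.get_config())


# Possible usage after h2osteam.login(...):
# import h2o
# from h2osteam.clients import H2oClient
# connect_with_fallback(H2oClient.get_cluster("test-cluster"), h2o)
```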
save_autodoc(model=None, config=None, path=None, train_frame=None, valid_frame=None, test_frame=None, alternative_models=[])

Generates AutoDoc report.

Parameters
  • model – H2OModel: the H2O model object for which the ml-autodoc will render a report document.

  • config – h2osteam.utils.AutoDocConfig: the configuration settings for save_autodoc.

  • path – str: Path where the generated AutoDoc report will be saved.

  • train_frame – H2OFrame: the training frame used to build the H2O model.

  • valid_frame – H2OFrame: the validation frame used to build the H2O model.

  • test_frame – H2OFrame: additional test dataset (not used during training) which contains the same feature names found in the training dataset.

  • alternative_models – list: a list of H2OModels. These are models that exist in the current running H2O cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.save_autodoc(model=titanic_glm, path="/tmp/report", test_frame="test")
status()

Get status of the H2O cluster.

Returns

H2O cluster status as a string.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.status()
>>> # running
stop()

Stop a running H2O cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.stop()
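Since delete() applies to a stopped cluster, status(), stop(), and delete() can be combined into one cleanup step. This is a minimal sketch: the helper name stop_and_delete is hypothetical, the "running" string is taken from the status() example, and the sketch assumes stop() returns only once the cluster is actually stopped, which this reference does not confirm.

```python
def stop_and_delete(cluster):
    """Stop the cluster if it reports itself as running, then delete it."""
    # "running" is the status value shown in the status() example.
    if cluster.status() == "running":
        # Assumption: stop() has completed by the time it returns.
        cluster.stop()
    cluster.delete()


# Possible usage after h2osteam.login(...):
# from h2osteam.clients import H2oClient
# stop_and_delete(H2oClient.get_cluster("test-cluster"))
```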
wait()

Wait for H2O cluster to finish launching.
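No example is given for wait(), so the following is only a sketch of how it might be combined with status(). It assumes wait() blocks until the launch has completed, as its description suggests; the helper name wait_and_report is hypothetical.

```python
def wait_and_report(cluster):
    """Block until the cluster has finished launching, then return its status."""
    cluster.wait()           # assumed to return once launching is done
    return cluster.status()  # status is returned as a string


# Possible usage after h2osteam.login(...):
# from h2osteam.clients import H2oClient
# print(wait_and_report(H2oClient.get_cluster("test-cluster")))
```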

Sparkling Water Client

sparkling_client

class h2osteam.clients.sparkling.sparkling_client.SparklingClient
static get_cluster(name=None)

Get an existing Sparkling Water cluster.

Parameters

name – Name of the cluster.

Returns

Sparkling Water cluster as a SparklingSession object.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> SparklingClient.get_cluster("test-cluster")
static get_clusters()

Get all Sparkling Water clusters available to this user.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> SparklingClient.get_clusters()
static launch_sparkling_cluster(name=None, version=None, profile_name=None, python_environment_name=None, driver_cores=None, driver_memory_gb=None, executors=None, executor_cores=None, executor_memory_gb=None, h2o_nodes=None, h2o_node_memory_gb=None, h2o_node_cpus=None, h2o_extra_memory_percent=None, max_idle_h=None, max_uptime_h=None, timeout_s=None, yarn_queue='', spark_properties=None)

Launch a new Sparkling Water cluster.

Launches a new Sparkling Water cluster using the parameters described below. You do not need to specify all parameters; any parameter you omit is filled in from the default values of the selected profile. Launching a cluster can take up to 5 minutes.

Parameters
  • name – Name of the cluster.

  • version – Version of Sparkling Water that will be used in the cluster.

  • profile_name – (Optional) Name of the profile for the cluster.

  • python_environment_name – (Optional) Specify the Python environment name you want to use.

  • driver_cores – (Optional) Number of Spark driver cores.

  • driver_memory_gb – (Optional) Amount of Spark driver memory in GB.

  • executors – (Optional) Number of Spark executors.

  • executor_cores – (Optional) Number of Spark executor cores.

  • executor_memory_gb – (Optional) Amount of Spark executor memory in GB.

  • h2o_nodes – (Optional) Specify the number of H2O nodes for the cluster.

  • h2o_node_memory_gb – (Optional) Specify the amount of memory that should be available on each H2O node.

  • h2o_node_cpus – (Optional) Number of CPUs/threads used by H2O on a single node. Specify ‘0’ to use all available CPUs/threads.

  • h2o_extra_memory_percent – (Optional) Specify the amount of extra memory for internal JVM use outside of the Java heap.

  • max_idle_h – (Optional) Maximum amount of time in hours the cluster can be idle before shutting down.

  • max_uptime_h – (Optional) Maximum amount of time in hours the cluster will be up before shutting down.

  • timeout_s – (Optional) Maximum amount of time in seconds to wait for the H2O cluster to start.

  • spark_properties – (Optional) Specify additional Spark properties as a Python dictionary.

  • yarn_queue – (Optional) Name of the YARN queue where the cluster will be placed.

Returns

Sparkling Water cluster as a SparklingSession object.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> SparklingClient.launch_sparkling_cluster(name="test-cluster",
...                                          version="3.28.0.2",
...                                          executors=4,
...                                          executor_memory_gb=10)
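The spark_properties parameter takes a Python dictionary, which the example above does not show. The sketch below adds one to the same launch arguments. The two property names are standard Spark settings chosen only for illustration; which properties Steam accepts, and whether values must be strings, is not stated in this reference and is assumed here.

```python
# Additional Spark settings go in a plain dict keyed by property name.
# These two entries are illustrative assumptions, not recommended values.
spark_properties = {
    "spark.sql.shuffle.partitions": "64",
    "spark.executor.memoryOverhead": "1g",
}

launch_args = dict(
    name="test-cluster",
    version="3.28.0.2",
    executors=4,
    executor_memory_gb=10,
    spark_properties=spark_properties,
)

# Possible usage after h2osteam.login(...):
# from h2osteam.clients import SparklingClient
# cluster = SparklingClient.launch_sparkling_cluster(**launch_args)
```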

sparkling_cluster

class h2osteam.clients.sparkling.sparkling_cluster.SparklingSession(cluster)
delete()

Delete a stopped Sparkling Water cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.delete()
download_autodoc(model=None, config=None, path=None, train_frame=None, valid_frame=None, test_frame=None, alternative_models=[])

Generates and downloads the AutoDoc report.

Parameters
  • model – H2OModel: the H2O model object for which the ml-autodoc will render a report document.

  • config – h2osteam.utils.AutoDocConfig: the configuration settings for download_autodoc.

  • path – str: Path where the generated AutoDoc report will be downloaded.

  • train_frame – H2OFrame: the training frame used to build the H2O model.

  • valid_frame – H2OFrame: the validation frame used to build the H2O model.

  • test_frame – H2OFrame: additional test dataset (not used during training) which contains the same feature names found in the training dataset.

  • alternative_models – list: a list of H2OModels. These are models that exist in the current running H2O cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.download_autodoc(model=titanic_glm, path="/tmp/report", test_frame="test")
download_autodoc_logs(path=None)

Download AutoDoc logs.

Parameters

path – Path where the AutoDoc logs will be saved.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.download_autodoc_logs("/tmp/test-cluster-logs")
download_logs(path=None)

Download logs of the Sparkling cluster.

Parameters

path – Path where the Sparkling logs will be saved.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.download_logs("/tmp/test-cluster-logs")
save_autodoc(model=None, config=None, path=None, train_frame=None, valid_frame=None, test_frame=None, alternative_models=[])

Generates AutoDoc report.

Parameters
  • model – H2OModel: the H2O model object for which the ml-autodoc will render a report document.

  • config – h2osteam.utils.AutoDocConfig: the configuration settings for save_autodoc.

  • path – str: Path where the generated AutoDoc report will be saved.

  • train_frame – H2OFrame: the training frame used to build the H2O model.

  • valid_frame – H2OFrame: the validation frame used to build the H2O model.

  • test_frame – H2OFrame: additional test dataset (not used during training) which contains the same feature names found in the training dataset.

  • alternative_models – list: a list of H2OModels. These are models that exist in the current running H2O cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.save_autodoc(model=titanic_glm, path="/tmp/report", test_frame="test")
send_statement(statement=None)

Send a single statement to the remote Spark session.

Parameters

statement – String containing the statement to run in the Spark session.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.send_statement('f_crimes = h2o.import_file(path="../data/chicagoCrimes10k.csv", col_types=column_type)')
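Since send_statement takes one statement at a time, a multi-step script can be sent by looping over its statements. This is a minimal sketch: the helper name send_statements is hypothetical, and it assumes that variables defined by one statement remain available to later ones in the same remote session, which this reference does not confirm.

```python
def send_statements(session, statements):
    """Send each statement in order; collect whatever send_statement returns."""
    return [session.send_statement(statement) for statement in statements]


# Single quotes outside let the statements contain double-quoted strings.
statements = [
    'path = "../data/chicagoCrimes10k.csv"',
    'f_crimes = h2o.import_file(path=path)',
]

# Possible usage after h2osteam.login(...):
# from h2osteam.clients import SparklingClient
# send_statements(SparklingClient.get_cluster("test-cluster"), statements)
```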
session()

Connect to the remote Spark session and issue commands.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.session()
status()

Get status of the Sparkling Water cluster.

Returns

Sparkling Water cluster status as a string.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.status()
>>> # running
stop()

Stop a running Sparkling Water cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.stop()
wait()

Wait for the Sparkling Water cluster to finish launching.

class h2osteam.clients.sparkling.sparkling_cluster.SparklingShell(session)
onecmd(s)

Interpret the argument as though it had been typed in response to the prompt.

This may be overridden, but should not normally need to be; see the precmd() and postcmd() methods for useful execution hooks. The return value is a flag indicating whether interpretation of commands by the interpreter should stop.

postloop()

Hook method executed once when the cmdloop() method is about to return.

preloop()

Hook method executed once when the cmdloop() method is called.