Examples
========

This section provides complete examples of using the Enterprise Steam Python client.

Launching and connecting to H2O cluster
---------------------------------------

This example shows how to log in to Steam and launch an H2O cluster with 4 nodes and 10 GB of memory per node.
The cluster uses H2O version 3.28.0.2 and the ``default-h2o`` profile, and is submitted to the default YARN queue.
All other H2O parameters are pre-filled from the selected profile.
Once the cluster is up, we connect to it and import data.

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oClient

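    # Log in to Steam (an access token is used as the password here).
    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    # Launch the cluster; parameters not given here come from the profile.
    cluster = H2oClient.launch_cluster(name="test-cluster",
                                       profile_name="default-h2o",
                                       version="3.28.0.2",
                                       nodes=4,
                                       node_memory_gb=10)
    # Connect the local h2o client to the running cluster.
    cluster.connect()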
    airlines = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"
    airlines_df = h2o.import_file(path=airlines)
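
Hard-coding an access token in a script makes it easy to leak. The following is a minimal sketch, assuming the credentials are stored in environment variables. The variable names ``STEAM_URL``, ``STEAM_USER`` and ``STEAM_TOKEN`` and the helper ``steam_login_kwargs`` are hypothetical and not part of the Steam client; only the ``h2osteam.login`` keyword arguments are taken from the example above.

.. code-block:: python

    import os


    def steam_login_kwargs(env=None):
        """Build keyword arguments for h2osteam.login from environment variables.

        The variable names are illustrative assumptions, not a Steam convention.
        """
        env = os.environ if env is None else env
        required = ("STEAM_URL", "STEAM_USER", "STEAM_TOKEN")
        missing = [key for key in required if not env.get(key)]
        if missing:
            raise RuntimeError("Missing environment variables: " + ", ".join(missing))
        return {
            "url": env["STEAM_URL"],
            "username": env["STEAM_USER"],
            "password": env["STEAM_TOKEN"],
            "verify_ssl": True,
        }


    # Usage (requires the h2osteam package to be installed):
    # import h2osteam
    # h2osteam.login(**steam_login_kwargs())

Failing early on a missing variable keeps an empty token from being sent to the server.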

Connecting to existing H2O cluster
----------------------------------

This example shows how to log in to Steam, connect to an existing H2O cluster called ``test-cluster``, and import data.

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    cluster = H2oClient.get_cluster("test-cluster")
    cluster.connect()
    airlines = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"
    airlines_df = h2o.import_file(path=airlines)
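
The two H2O examples can be combined into a "reuse if present, otherwise launch" pattern. The sketch below is generic and hedged: ``get_or_launch`` is a hypothetical helper, and it assumes that looking up a missing cluster either raises an exception or returns ``None``. How ``H2oClient.get_cluster`` actually reports a missing cluster should be checked against the client version in use.

.. code-block:: python

    def get_or_launch(get_existing, launch_new):
        """Return the result of get_existing(), or launch_new() if that fails.

        Assumption: a missing cluster makes get_existing() raise or return None.
        Catching Exception is deliberately broad for this sketch and would also
        hide unrelated errors such as a failed login.
        """
        try:
            cluster = get_existing()
        except Exception:
            cluster = None
        if cluster is None:
            cluster = launch_new()
        return cluster


    # Usage with the calls shown in this section (requires h2osteam):
    # cluster = get_or_launch(
    #     lambda: H2oClient.get_cluster("test-cluster"),
    #     lambda: H2oClient.launch_cluster(name="test-cluster",
    #                                      profile_name="default-h2o",
    #                                      version="3.28.0.2",
    #                                      nodes=4,
    #                                      node_memory_gb=10),
    # )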

Launching and connecting to Sparkling Water cluster
---------------------------------------------------

This example shows how to log in to Steam and launch a Sparkling Water cluster with 4 executors and 10 GB of memory per executor.
The cluster uses Sparkling Water version 3.28.0.2 and the ``default-sparkling-internal`` profile, and is submitted to the ``default`` YARN queue.
The profile type determines the cluster backend type; in this case the cluster starts in internal mode.
All other Sparkling Water parameters are pre-filled from the selected profile.
Once the cluster is up, we can send statements to the remote Spark session to import data.

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import SparklingClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    cluster = SparklingClient.launch_sparkling_cluster(name="test-sparkling-cluster",
                                                       profile_name="default-sparkling-internal",
                                                       version="3.28.0.2",
                                                       executors=4,
                                                       executor_memory_gb=10,
                                                       yarn_queue="default")

    # Each statement is a string of code that runs in the remote Spark session.
    cluster.send_statement('airlines = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"')
    cluster.send_statement('airlines_df = h2o.import_file(path=airlines)')
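
Because ``send_statement`` takes source code as a string, values such as URLs have to be quoted correctly inside that string. A minimal sketch, assuming the remote session evaluates Python as in the example above: the hypothetical helper ``assignment_statement`` (not part of the Steam client) uses ``repr`` to produce a valid Python literal.

.. code-block:: python

    def assignment_statement(name, value):
        """Return Python source that assigns a literal value to a variable."""
        if not name.isidentifier():
            raise ValueError(f"not a valid Python identifier: {name!r}")
        # repr() of a str, int, float, bool or None is a valid Python literal.
        return f"{name} = {value!r}"


    airlines_url = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"
    statement = assignment_statement("airlines", airlines_url)

    # The resulting string could then be sent to the cluster (requires h2osteam):
    # cluster.send_statement(statement)

This avoids hand-nesting single and double quotes when the value itself contains quote characters.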

Connecting to existing Sparkling Water cluster
----------------------------------------------

This example shows how to log in to Steam, connect to an existing Sparkling Water cluster called ``test-sparkling-cluster``, and import data.

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import SparklingClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    cluster = SparklingClient.get_cluster("test-sparkling-cluster")

    # Each statement is a string of code that runs in the remote Spark session.
    cluster.send_statement('airlines = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"')
    cluster.send_statement('airlines_df = h2o.import_file(path=airlines)')

Launching and connecting to Driverless AI instance
--------------------------------------------------

This example shows how to log in to Steam, create an instance of Driverless AI v1.8.4.1, connect to it, and create training and test datasets.

.. code-block:: python

    import h2osteam
    from h2oai_client import Client
    from h2osteam.clients import DaiClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    instance = DaiClient.launch_instance(name="test-instance", version="1.8.4.1")
    client = instance.connect()

    train_path = '/data/Kaggle/CreditCard/CreditCard-train.csv'
    test_path = '/data/Kaggle/CreditCard/CreditCard-test.csv'

    train = client.create_dataset_sync(train_path)
    test = client.create_dataset_sync(test_path)

Connecting to existing Driverless AI instance
---------------------------------------------

This example shows how to log in to Steam, connect to an existing Driverless AI instance called ``test-instance``, and create training and test datasets.

.. code-block:: python

    import h2osteam
    from h2oai_client import Client
    from h2osteam.clients import DaiClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    instance = DaiClient.get_instance(name="test-instance")
    client = instance.connect()

    train_path = '/data/Kaggle/CreditCard/CreditCard-train.csv'
    test_path = '/data/Kaggle/CreditCard/CreditCard-test.csv'

    train = client.create_dataset_sync(train_path)
    test = client.create_dataset_sync(test_path)