h2osteam
-
class h2osteam.AutoDocConfig(template_path=None, template_sections_path=None, sub_template_type=None, main_template_type='docx', float_format='{:6.4g}', data_summary_feat_num=-1, num_features=20, plot_num_features=20, min_relative_importance=0, stats_quantiles=20, psi_quantiles=10, response_rate_quantiles=10, pdp_feature_list=None, mli_frame=None, ice_frame=None, num_ice_rows=0, cardinality_limit=25, pdp_out_of_range=3, pdp_num_bins=10, warning_shift_auc_threshold=0.8, include_hist=True, use_shapley=True, **kwargs)
This class lets you set AutoDoc’s advanced configuration options. No parameters are required (i.e., you can run config = AutoDocConfig()).
- Parameters
template_path – str, optional: Path to general or custom template. Defaults to None.
template_sections_path – str, optional: Path to general or custom template sections. Defaults to None.
sub_template_type – str, optional: The subtemplate type (e.g., ‘docx’ or ‘md’). Defaults to the main_template_type value.
main_template_type – str, optional: The document type (e.g., ‘docx’ or ‘md’). Defaults to ‘docx’.
float_format – str: Format string syntax. Defaults to “{:6.4g}”: a minimum total width of 6 with 4 significant digits, using the ‘g’ general format.
data_summary_feat_num – int: Number of features to show in the data summary. Value must be an integer. Values lower than 1, e.g., 0 or -1, indicate that all features should be shown. Defaults to -1.
num_features – int: The number of top features to display in the document tables. Defaults to 20.
plot_num_features – int: The number of top features to display in the document plots. Defaults to 20.
min_relative_importance – The minimum relative importance in order for a feature to be displayed in the feature importance table/plot. Defaults to 0.
stats_quantiles – int: The number of quantiles to use for prediction statistics computation. Defaults to 20.
psi_quantiles – int: The number of quantiles to use for population stability index computation. Defaults to 10.
response_rate_quantiles – int: The number of quantiles to use for response rates information computation. Defaults to 10.
pdp_feature_list – list: A list of feature names (str) for which to create partial dependence plots.
mli_frame – H2OFrame: An H2OFrame on which the partial dependence and Shapley values will be calculated. If no H2OFrame is specified the training frame is used. Defaults to None.
ice_frame – H2OFrame, optional: An H2OFrame on which the individual conditional expectation will be calculated. If no H2OFrame is specified then ice rows will be selected automatically.
num_ice_rows – int, optional: The number of rows to select automatically from the training data for individual conditional expectation. This argument is ignored if the ice_frame argument is provided. Defaults to 0.
cardinality_limit – int: The maximum number of categorical levels a feature can have, above which the partial dependence plot will not be generated. Defaults to 25.
use_hdfs – bool: Whether to save the document to HDFS. Requires that H2O or Sparkling Water cluster has access to HDFS. Defaults to False.
pdp_out_of_range – int: The number of standard deviations, outside of the range of a column, to include in partial dependence plots. This shows how the model will react to data it has not seen before. Defaults to 3.
pdp_num_bins – int: The number of bins for the partial dependence plot. Defaults to 10.
warning_shift_auc_threshold – float: A warning is shown if the AUC is greater than or equal to this threshold. Defaults to 0.8.
use_shapley – bool: Whether to calculate Shapley values for algorithms where they are available. Note that Shapley value calculations may take a long time for very wide datasets. Defaults to True.
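The default float_format is an ordinary Python format specification, so its effect can be previewed with str.format before generating a document. The following is a minimal sketch; the sample values are illustrative only and are not taken from any AutoDoc report.

```python
# The default float_format string from the AutoDocConfig signature.
float_format = "{:6.4g}"

# Illustrative values only (assumption: any Python floats would do here).
samples = [3.14159, 0.5, 1234567.0]

# Apply the format string to each value, as str.format would.
formatted = [float_format.format(value) for value in samples]

for value, text in zip(samples, formatted):
    # repr() makes the padding spaces visible.
    print(repr(value), "->", repr(text))
```

The three results are ' 3.142', '   0.5', and '1.235e+06'. The width of 6 is a minimum, so short values are left-padded with spaces while longer ones are not truncated.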
-
serialize()
-
class h2osteam.SteamClient(conn=None)
DEPRECATED! This class and its methods are deprecated and will be removed in v1.8.
-
create_pyspark_python_path_environment(name, path)
DEPRECATED! Create Python Pyspark Path environment.
-
delete_python_environment(environment_id)
DEPRECATED! Delete Python environment.
-
static get_h2o_cluster(cluster_name)
DEPRECATED! Get H2O cluster by name.
-
get_h2o_clusters()
DEPRECATED! Get H2O clusters.
-
get_python_environments()
DEPRECATED! Get Python environments.
-
static get_sparkling_cluster(cluster_name)
DEPRECATED! Get Sparkling Water cluster by name.
-
get_sparkling_clusters()
DEPRECATED! Get Sparkling Water clusters.
-
static show_profiles()
DEPRECATED! Prints profiles available to this user.
-
static start_external_sparkling_cluster(cluster_name=None, profile_name=None, h2o_version=None, driver_cores=0, driver_memory_gb=0, num_executors=0, executor_cores=0, executor_memory_gb=0, h2o_nodes=0, h2o_node_memory_gb=0, h2o_node_threads=0, start_timeout_sec=0, yarn_queue=None, python_environment_name='', spark_properties=None)
DEPRECATED! Launch Sparkling Water external backend cluster.
-
static start_h2o_cluster(cluster_name=None, profile_name=None, num_nodes=0, node_memory=None, v_cores=0, n_threads=0, max_idle_time=0, max_uptime=0, extramempercent=10, h2o_version=None, yarn_queue=None, callback_ip=None, node_id=0)
DEPRECATED! Launch a new H2O cluster.
-
static start_internal_sparkling_cluster(cluster_name=None, profile_name=None, h2o_version=None, driver_cores=0, driver_memory_gb=0, num_executors=0, executor_cores=0, executor_memory_gb=0, h2o_node_threads=0, start_timeout_sec=0, yarn_queue=None, python_environment_name='', spark_properties=None)
DEPRECATED! Launch Sparkling Water internal backend cluster.
-
static stop_h2o_cluster(config)
DEPRECATED! Stop H2O cluster.
-
static upload_conda_environment(name, path)
DEPRECATED! Upload Conda Python environments.
-
static upload_engine(path)
DEPRECATED! Upload H2O engine.
-
static upload_sparkling_engine(path)
DEPRECATED! Upload Sparkling Water engine.
-
-
h2osteam.api()
Get direct access to the Steam API for expert users only.
Expert users can bypass the clients for each product and access the Steam API directly. This use case is not supported and not recommended. If possible, use the provided clients.
- Examples
>>> import h2osteam
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> api = h2osteam.api()
>>> api
-
h2osteam.login(url=None, username=None, password=None, verify_ssl=True, cacert=None, ca_cert=None)
Connect to an existing Enterprise Steam server.
Authenticate by passing the server URL together with a username and either a password or a user access token.
- Parameters
url – Full URL (including scheme and port) of the Steam server to connect to. Must use the https scheme.
username – Username of the connecting user.
password – Password or user access token of the connecting user.
verify_ssl – Setting this to False disables SSL certificate verification. Defaults to True.
cacert – (Optional) Path to a CA bundle file or a directory with certificates of trusted CAs.
ca_cert – (DEPRECATED) Path to a CA bundle file or a directory with certificates of trusted CAs.
- Examples
>>> import h2osteam
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
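Because url must include the port and must use https, it can help to check a URL before passing it to h2osteam.login. The sketch below uses only the Python standard library. check_steam_url is a hypothetical helper, not part of h2osteam, and the checks that login itself performs may differ.

```python
from urllib.parse import urlsplit


def check_steam_url(url):
    """Hypothetical helper (not part of h2osteam).

    Returns True if url uses https and names both a host and an explicit port.
    """
    parts = urlsplit(url)
    try:
        # Accessing .port raises ValueError for a malformed or out-of-range port.
        port = parts.port
    except ValueError:
        return False
    return parts.scheme == "https" and bool(parts.hostname) and port is not None


# Example URLs are illustrative; the first matches the documented form.
print(check_steam_url("https://steam.example.com:9555"))
print(check_steam_url("http://steam.example.com:9555"))
print(check_steam_url("https://steam.example.com"))
```

Only the first URL passes: the second does not use https, and the third omits the port.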
-
h2osteam.print_profiles()
Prints details about the profiles available to the logged-in user.
- Examples
>>> import h2osteam
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> h2osteam.print_profiles()
>>> # Profile name: default-h2o
>>> # Profile type: h2o
>>> # Number of nodes: MIN=1 MAX=10
>>> # Node memory [GB]: MIN=1 MAX=30
>>> # Threads per node: MIN=0 MAX=0
>>> # Extra memory [%]: MIN=10 MAX=50
>>> # Max idle time [hrs]: MIN=1 MAX=24
>>> # Max uptime [hrs]: MIN=1 MAX=24
>>> # YARN virtual cores: MIN=0 MAX=0
>>> # YARN queues:
-
h2osteam.print_python_environments()
Prints details about the Sparkling Water Python environments available to the logged-in user.
- Examples
>>> import h2osteam
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> h2osteam.print_python_environments()
>>> # Name: Python 2.7 default
>>> # Python Pyspark Path:
>>> # Conda Pack path: lib/conda-pack/python-27-default.tar.gz
>>> # ===
>>> # Name: Python 3.7 default
>>> # Python Pyspark Path:
>>> # Conda Pack path: lib/conda-pack/python-37-default.tar.gz