H2O AutoDoc

Version 1.0.3 (build 301)

  • H2O AutoDoc automatically generates a model report for supported supervised learning models from H2O-3 and Scikit-Learn.
    Please contact us if you need help installing this software.

Licensing

  • H2O AutoDoc requires a commercial license. If you do not have one, please contact H2O.ai.

Installation Instructions

  • Download on Linux (x86-64)
  • Download on MacOS X

1. Prerequisite: Python 3.6.x or 3.7.x

2. Install dependencies:

  • Pandoc
  • How to install Pandoc on Ubuntu:

wget https://github.com/jgm/pandoc/releases/download/2.9.1.1/pandoc-2.9.1.1-1-amd64.deb
sudo dpkg -i pandoc-2.9.1.1-1-amd64.deb
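Steps 1 and 2 can be verified from Python before moving on. The following is a minimal sketch, not part of H2O AutoDoc: the helper name `check_prerequisites` is hypothetical, and it assumes that Pandoc is usable whenever a `pandoc` executable is found on the PATH.

```python
import shutil
import sys

# Python versions listed in the prerequisite step above.
SUPPORTED_PYTHONS = {(3, 6), (3, 7)}


def check_prerequisites(version=None, which=shutil.which):
    """Return a list of problems found; an empty list means both checks passed.

    Hypothetical helper: `version` and `which` are injectable only so the
    checks can be exercised without changing the running interpreter.
    """
    version = tuple(sys.version_info[:2]) if version is None else tuple(version[:2])
    problems = []
    if version not in SUPPORTED_PYTHONS:
        problems.append(
            f"Python {version[0]}.{version[1]} found; 3.6.x or 3.7.x is required"
        )
    # Assumption: a `pandoc` executable on the PATH means Pandoc is installed.
    if which("pandoc") is None:
        problems.append("pandoc executable not found on PATH")
    return problems


# Report anything that still needs attention in the current environment.
for problem in check_prerequisites():
    print("Prerequisite issue:", problem)
```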

3. Install the h2o_autodoc wheel with pip for your Python version

Python 3.6

pip install https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/h2o_autodoc/1.0.3-301/dist/h2o_autodoc-1.0.3-cp36-cp36m-linux_x86_64.whl

Python 3.7

pip install https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/h2o_autodoc/1.0.3-301/dist/h2o_autodoc-1.0.3-cp37-cp37m-linux_x86_64.whl
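The two wheel URLs above differ only in the CPython tag (`cp36` or `cp37`), so the matching URL can be derived from the interpreter version. This is a sketch under that assumption: `wheel_url` is a hypothetical helper, and it only reproduces the two URLs listed here.

```python
import sys

# Base location shared by both wheel URLs listed above.
BASE_URL = (
    "https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/"
    "h2o_autodoc/1.0.3-301/dist"
)


def wheel_url(version=None):
    """Return the h2o_autodoc 1.0.3 wheel URL for a (major, minor) version.

    Hypothetical helper; defaults to the running interpreter's version.
    """
    major, minor = (sys.version_info if version is None else version)[:2]
    if (major, minor) not in {(3, 6), (3, 7)}:
        raise ValueError(
            f"no h2o_autodoc 1.0.3 wheel is listed for Python {major}.{minor}"
        )
    tag = f"cp{major}{minor}"
    return f"{BASE_URL}/h2o_autodoc-1.0.3-{tag}-{tag}m-linux_x86_64.whl"


# Print the pip command for a Python 3.7 environment.
print("pip install", wheel_url((3, 7)))
```

Calling `wheel_url()` with no argument uses the current interpreter and raises `ValueError` on any version other than 3.6 or 3.7.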

How to generate an H2O AutoDoc from an H2O-3 Model

  • If you don't have H2O-3 installed yet, install it with pip from your terminal, or see the H2O-3 documentation for more detailed installation instructions.
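A quick way to tell whether H2O-3 is already available is to ask Python whether the `h2o` module can be found. The sketch below assumes the PyPI package name `h2o`; the helper names `h2o_is_installed` and `h2o_install_command` are hypothetical.

```python
import importlib.util
import sys


def h2o_is_installed(find_spec=importlib.util.find_spec):
    """Return True if an importable `h2o` module is found."""
    return find_spec("h2o") is not None


def h2o_install_command(python=sys.executable):
    """Return the pip command, as an argument list, that installs H2O-3.

    Assumption: the H2O-3 Python client is published on PyPI as `h2o`.
    """
    return [python, "-m", "pip", "install", "h2o"]


# Print the install command only when H2O-3 is missing.
if not h2o_is_installed():
    print("H2O-3 not found; install it with:", " ".join(h2o_install_command()))
```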

1. Create an H2O model

Run the following code in your interactive Python session or Jupyter notebook:

# import h2o and initialize h2o cluster
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
h2o.init()

# URLs of the training and validation datasets
train_path = "https://s3.amazonaws.com/h2o-training/events/ibm_index/CreditCard_Cat-train.csv"
valid_path = "https://s3.amazonaws.com/h2o-training/events/ibm_index/CreditCard_Cat-test.csv"

# load the training and validation datasets into H2O frames
train = h2o.import_file(train_path, destination_frame='CreditCard_Cat-train.csv')
valid = h2o.import_file(valid_path, destination_frame='CreditCard_Cat-test.csv')

# set predictors and response
predictors = train.columns
predictors.remove('ID')
response = "DEFAULT_PAYMENT_NEXT_MONTH"

# convert target to factor
train[response] = train[response].asfactor()
valid[response] = valid[response].asfactor()

# build an H2O-3 GBM Model
model = H2OGradientBoostingEstimator(model_id="gbm_model", seed=1234)
model.train(x=predictors, y=response, training_frame=train, validation_frame=valid)
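The predictor selection in the example above removes only the ID column. A small variation, sketched here in plain Python, also excludes the response column explicitly so the intent is visible. The helper `select_predictors` and the `FEATURE_A` / `FEATURE_B` column names are hypothetical placeholders, not columns of the dataset.

```python
def select_predictors(columns, exclude):
    """Return `columns` in their original order, minus any names in `exclude`."""
    excluded = set(exclude)
    return [name for name in columns if name not in excluded]


# Illustrative column list; with H2O this would be train.columns.
columns = ["ID", "FEATURE_A", "FEATURE_B", "DEFAULT_PAYMENT_NEXT_MONTH"]
response = "DEFAULT_PAYMENT_NEXT_MONTH"

predictors = select_predictors(columns, exclude=["ID", response])
print(predictors)
```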

2. Create H2O AutoDoc Model Report

  • For the following example, you will need to update the output_file_path and license_file variables with file paths specific to your environment.

# Parameters the User Must Set: output_file_path and license_file
# specify your license file location and where to save your H2O AutoDoc Report
license_file = "full/path/to/your/license.sig"
output_file_path = "full/path/to/your/autodoc/report_H2O3.docx"

# H2O AutoDoc package imports
from h2o_autodoc import Config
from h2o_autodoc import render_autodoc

# set your H2O AutoDoc configurations
config = Config(output_path=output_file_path, license_file=license_file)

# generate an H2O AutoDoc report for your model
render_autodoc(h2o, config, model)
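Because `license_file` and `output_file_path` in the example above are placeholders, it can help to confirm that they point at real locations before calling `render_autodoc`. The following is a minimal pre-flight sketch using only the standard library. `check_report_paths` is a hypothetical helper, not part of `h2o_autodoc`, and it assumes the output directory must already exist.

```python
from pathlib import Path


def check_report_paths(license_file, output_file_path):
    """Return a list of problems with the two user-supplied paths.

    Hypothetical helper; an empty list means both paths look usable.
    """
    problems = []
    if not Path(license_file).is_file():
        problems.append(f"license file not found: {license_file}")
    # Assumption: the report's parent directory is expected to exist already.
    output_dir = Path(output_file_path).parent
    if not output_dir.is_dir():
        problems.append(f"output directory does not exist: {output_dir}")
    return problems


# With the placeholder paths from the example, both checks are expected to fail.
for problem in check_report_paths(
    "full/path/to/your/license.sig",
    "full/path/to/your/autodoc/report_H2O3.docx",
):
    print("Path issue:", problem)
```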