- The H2O AutoDoc generates an automatic model report for supported supervised learning models from H2O-3 and Scikit-Learn.
Please contact us if you need help installing this software.
Licensing
- The AutoDoc requires a commercial license. If you do not have a commercial license, please contact H2O.ai.
Installation Instructions
1. Prerequisite: Python 3.6.x, 3.7.x or 3.8.x
2. Install dependencies:
- Pandoc
- How to install Pandoc on Ubuntu:
wget https://github.com/jgm/pandoc/releases/download/2.9.1.1/pandoc-2.9.1.1-1-amd64.deb
dpkg -i pandoc*.deb
3. Pip install the h2o_autodoc wheel that matches your Python version. Run only one of the following commands (cp36 for Python 3.6, cp37 for 3.7, cp38 for 3.8):
pip install https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/h2o_autodoc/1.0.4-5/dist/h2o_autodoc-1.0.4-cp36-cp36m-linux_x86_64.whl
pip install https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/h2o_autodoc/1.0.4-5/dist/h2o_autodoc-1.0.4-cp37-cp37m-linux_x86_64.whl
pip install https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/h2o_autodoc/1.0.4-5/dist/h2o_autodoc-1.0.4-cp38-cp38-linux_x86_64.whl
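The checks above can be sketched in Python. This is a minimal, unofficial helper, not part of h2o_autodoc: the names autodoc_wheel_url and pandoc_available are hypothetical, and the wheel tags are copied from the three pip commands above. It reports whether pandoc is on the PATH and which wheel URL matches the running interpreter.

```python
import shutil
import sys

# Base location shared by the three wheel URLs listed above.
BASE_URL = (
    "https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/"
    "h2o_autodoc/1.0.4-5/dist"
)

# Wheel tags copied from the pip commands above, keyed by (major, minor).
WHEEL_TAGS = {
    (3, 6): "cp36-cp36m",
    (3, 7): "cp37-cp37m",
    (3, 8): "cp38-cp38",
}


def autodoc_wheel_url(version=None):
    """Return the wheel URL for a (major, minor) Python version.

    Hypothetical helper: defaults to the running interpreter and raises
    ValueError for versions that have no wheel in the list above.
    """
    if version is None:
        version = sys.version_info[:2]
    key = tuple(version[:2])
    if key not in WHEEL_TAGS:
        raise ValueError(
            "no h2o_autodoc 1.0.4 wheel listed for Python %d.%d" % key
        )
    return "{}/h2o_autodoc-1.0.4-{}-linux_x86_64.whl".format(
        BASE_URL, WHEEL_TAGS[key]
    )


def pandoc_available():
    """Return True if a pandoc executable is found on the PATH."""
    return shutil.which("pandoc") is not None


if __name__ == "__main__":
    print("pandoc found:", pandoc_available())
    try:
        print("pip install", autodoc_wheel_url())
    except ValueError as err:
        print(err)
```

Running the script prints the pip command to use, or a message if the interpreter is outside the supported 3.6 to 3.8 range.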
How to generate an H2O AutoDoc from an H2O-3 Model
- If you don't have H2O-3 installed yet, run the following command in your terminal to pip install H2O-3, or see the more detailed installation instructions in the H2O-3 documentation.
pip install h2o
1. Create an H2O model
Run the following code in your interactive Python session or Jupyter notebook:
# import h2o and initialize h2o cluster
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
h2o.init()
# locations of the training and validation datasets
train_path = "https://s3.amazonaws.com/h2o-training/events/ibm_index/CreditCard_Cat-train.csv"
valid_path = "https://s3.amazonaws.com/h2o-training/events/ibm_index/CreditCard_Cat-test.csv"
# import the training and validation datasets into the H2O cluster
train = h2o.import_file(train_path, destination_frame='CreditCard_Cat-train.csv')
valid = h2o.import_file(valid_path, destination_frame='CreditCard_Cat-test.csv')
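# set predictors and response
response = "DEFAULT_PAYMENT_NEXT_MONTH"
# use every column except the row ID and the response itself as a predictor
predictors = [col for col in train.columns if col not in ("ID", response)]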
# convert target to factor
train[response] = train[response].asfactor()
valid[response] = valid[response].asfactor()
# build an H2O-3 GBM Model
model = H2OGradientBoostingEstimator(model_id="gbm_model", seed=1234)
model.train(x=predictors, y=response, training_frame=train, validation_frame=valid)
2. Create an H2O AutoDoc Model Report
- For the following example, you will need to update the output_file_path and license_file variables with file paths specific to your environment.
# Parameters the User Must Set: output_file_path and license_file
# specify your license file location and where to save your H2O AutoDoc Report
license_file = "full/path/to/your/license.sig"
output_file_path = "full/path/to/your/autodoc/report_H2O3.docx"
# H2O AutoDoc package imports
from h2o_autodoc import Config
from h2o_autodoc import render_autodoc
# set your H2O AutoDoc configurations
config = Config(output_path=output_file_path, license_file=license_file)
# generate an H2O AutoDoc report for your model
render_autodoc(h2o, config, model)
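Because the two paths must be edited by hand, it can help to validate them before rendering. The following is a minimal sketch, not part of the h2o_autodoc API: check_autodoc_paths is a hypothetical helper. It assumes the output directory should already exist, which is a conservative assumption, since this guide does not say whether AutoDoc creates missing directories.

```python
from pathlib import Path


def check_autodoc_paths(license_file, output_file_path):
    """Return a list of problems with the two paths (empty if none found).

    Hypothetical pre-flight check; it only inspects the file system and
    does not validate the contents of the license file.
    """
    problems = []
    if not Path(license_file).expanduser().is_file():
        problems.append("license file not found: {}".format(license_file))
    # Assumption: the report's parent directory must exist beforehand.
    parent = Path(output_file_path).expanduser().parent
    if not parent.is_dir():
        problems.append("output directory does not exist: {}".format(parent))
    return problems


if __name__ == "__main__":
    # Placeholder paths from the example above; replace with your own.
    for problem in check_autodoc_paths(
        "full/path/to/your/license.sig",
        "full/path/to/your/autodoc/report_H2O3.docx",
    ):
        print(problem)
```

If the returned list is empty, both paths look usable and render_autodoc can be called as shown above.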