Score: POJO

One value-add of H2O is that you can build a model from any front-end API and still export it as a POJO (Plain Old Java Object). This gives the user the flexibility to take the model outside of H2O, either to run it standalone or to integrate the Java object into a platform such as Apache Storm. The walkthrough below details the command-line steps required to export a model object and use it to score data with a sample class.

The working example, which doubles as a unit test, for scoring with the Java code is available on GitHub.

Walk-through

Step 1

For an H2O instance running at the default address localhost:54321 and a GBM model with 50 trees that you want to export, run the commands below to download the h2o-model jar file and the Java code for the example model GBM_a2647515ded07d5b710c82015a6842a9. Creating a new directory for each model is recommended.

$ mkdir GBM_a2647515ded07d5b710c82015a6842a9
$ cd GBM_a2647515ded07d5b710c82015a6842a9
$ curl http://localhost:54321/h2o-model.jar > h2o-model.jar
$ curl "http://localhost:54321/2/GBMModelView.java?_modelKey=GBM_a2647515ded07d5b710c82015a6842a9" > \
    GBM_a2647515ded07d5b710c82015a6842a9.java

Step 2

Download the PredictCSV class from git; it will be compiled together with the model object and used to run the scoring. Users can certainly write their own script; the one available in git is a working example that is tested on all the builds at 0xdata. It takes four arguments:

- --header
specify this argument if the input data set has a header row
- --model
name of the model key used to score the input file
- --input
the input data to be scored
- --output
the resulting output CSV file with the scores for each entry of the input data
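
For readers who write their own scoring script, the handling of these four arguments could be sketched roughly as follows. This is an illustration only and is not the actual PredictCSV source: the class name PredictArgs and the rule that --model, --input and --output are all required are assumptions made for the example.

```java
// Hypothetical helper, NOT the actual PredictCSV source: parses the four
// documented flags into a simple value object.
final class PredictArgs {
    boolean header = false; // --header: the input file has a header row
    String model;           // --model: model key used for scoring
    String input;           // --input: path of the CSV file to score
    String output;          // --output: path of the CSV file to write

    static PredictArgs parse(String[] args) {
        PredictArgs p = new PredictArgs();
        for (int i = 0; i < args.length; i++) {
            String a = args[i];
            if (a.equals("--header")) {
                p.header = true; // a bare flag, it takes no value
            } else if (a.equals("--model")) {
                p.model = value(args, ++i, a);
            } else if (a.equals("--input")) {
                p.input = value(args, ++i, a);
            } else if (a.equals("--output")) {
                p.output = value(args, ++i, a);
            } else {
                throw new IllegalArgumentException("Unknown argument: " + a);
            }
        }
        // Assumption for this sketch: everything except --header is required.
        if (p.model == null || p.input == null || p.output == null) {
            throw new IllegalArgumentException(
                "--model, --input and --output are required");
        }
        return p;
    }

    // Returns the value that follows a flag, or fails if it is missing.
    private static String value(String[] args, int i, String flag) {
        if (i >= args.length) {
            throw new IllegalArgumentException("Missing value for " + flag);
        }
        return args[i];
    }
}
```

Failing fast on an unknown or incomplete argument list keeps a mistyped flag from silently producing an empty output file.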

Step 3

Next, use javac to compile the model object together with PredictCSV.java, which in this case should generate over 50 class files for the trees.

$ javac -cp h2o-model.jar -J-Xmx2g -J-XX:MaxPermSize=256m PredictCSV.java \
    GBM_a2647515ded07d5b710c82015a6842a9.java

Step 4

Finally, feed in the test data to be scored by running the following command:

$ java -ea -cp .:h2o-model.jar -Xmx4g -XX:MaxPermSize=256m -XX:ReservedCodeCacheSize=256m \
    PredictCSV --header --model GBM_a2647515ded07d5b710c82015a6842a9 --input iris_test.csv \
    --output out_pojo.csv

More generic sample command:

$ java -ea -cp .:h2o-model.jar -Xmx4g -XX:MaxPermSize=256m -XX:ReservedCodeCacheSize=256m \
    PredictCSV --header --model <model key> --input <path to input data> \
    --output <path to output csv>
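
For users who prefer to write their own scoring script instead of using PredictCSV, the overall flow (read each input row, score it, write one line of scores) might look like the sketch below. The RowScorer interface is a hypothetical stand-in for the generated model class, whose real scoring method and signature may differ, and the sketch assumes every input column is numeric, which will not hold for data sets with categorical columns.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical stand-in for the generated model class; the real class's
// scoring method and signature may differ.
interface RowScorer {
    double[] score(double[] row);
}

final class CsvScoringSketch {
    // Parses one comma-separated line of numeric cells; empty cells become NaN.
    // Assumption: no quoted fields and no categorical (string) columns.
    static double[] parseRow(String line) {
        String[] cells = line.split(",", -1);
        double[] row = new double[cells.length];
        for (int i = 0; i < cells.length; i++) {
            String c = cells[i].trim();
            row[i] = c.isEmpty() ? Double.NaN : Double.parseDouble(c);
        }
        return row;
    }

    // Reads rows from `in`, scores each one, and writes one comma-separated
    // line of scores per input row to `out`. Returns the number of rows scored.
    static int scoreAll(Reader in, Writer out, boolean header, RowScorer scorer)
            throws IOException {
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out);
        if (header) {
            reader.readLine(); // skip the header row
        }
        int rows = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            double[] preds = scorer.score(parseRow(line));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < preds.length; i++) {
                if (i > 0) sb.append(',');
                sb.append(preds[i]);
            }
            writer.print(sb);
            writer.print('\n');
            rows++;
        }
        writer.flush();
        return rows;
    }
}
```

Passing a Reader and a Writer rather than file paths keeps the loop easy to exercise with in-memory strings before pointing it at real input and output files.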