Driverless AI MOJO Scoring Pipeline - C++ Runtime with Python and R Wrappers
The C++ Scoring Pipeline is provided as R and Python packages for the protobuf-based MOJO2 protocol. The packages are self-contained, so no additional software is required. Simply build the MOJO Scoring Pipeline and begin using your preferred method.
Notes:
- These scoring pipelines are currently not available for RuleFit models.
- The Download MOJO Scoring Pipeline button appears as Build MOJO Scoring Pipeline if the MOJO Scoring Pipeline is disabled.
Downloading the Scoring Pipeline Runtimes
Linux OS
The R and Python packages can be downloaded from within the Driverless AI application. To do this, click Resources, then click MOJO2 R Runtime or MOJO2 Py Runtime from the drop-down menu.
Examples
The following examples show how to use the R and Python APIs of the C++ MOJO runtime.
R Example
Prerequisites
- Linux OS (x86 or PPC) or Mac OS X (10.9 or newer)
- Driverless AI License (either file or environment variable)
- Rcpp (>=1.0.0)
- data.table
Running the MOJO2 R Runtime
# Install the R MOJO runtime using one of the methods below
# Install the R MOJO runtime on PPC Linux
install.packages("./daimojo_2.1.12_ppc64le-linux.tar.gz")
# Install the R MOJO runtime on x86 Linux
install.packages("./daimojo_2.1.12_x86_64-linux.tar.gz")
# Install the R MOJO runtime on Mac OS X
install.packages("./daimojo_2.1.12_x86_64-darwin.tar.gz")
# Load the MOJO
library(daimojo)
m <- load.mojo("./mojo-pipeline/pipeline.mojo")
# retrieve the creation time of the MOJO
create.time(m)
## [1] "2019-11-18 22:00:24 UTC"
# retrieve the UUID of the experiment
uuid(m)
## [1] "65875c15-943a-4bc0-a162-b8984fe8e50d"
# Load data and make predictions
col_class <- setNames(feature.types(m), feature.names(m)) # column names and types
library(data.table)
d <- fread("./mojo-pipeline/example.csv", colClasses=col_class)
predict(m, d)
## label.B label.M
## 1 0.08287659 0.91712341
## 2 0.77655075 0.22344925
## 3 0.58438434 0.41561566
## 4 0.10570505 0.89429495
## 5 0.01685609 0.98314391
## 6 0.23656610 0.76343390
## 7 0.17410333 0.82589667
## 8 0.10157948 0.89842052
## 9 0.13546191 0.86453809
## 10 0.94778244 0.05221756
Python Example
Prerequisites
- Linux OS (x86 or PPC) or Mac OS X (10.9 or newer)
- Driverless AI License (either file or environment variable)
- Python 3.6
- datatable. Run the following to install:
# Install on Linux PPC, Linux x86, or Mac OS X
pip install datatable
- Python MOJO runtime. Run one of the following commands after downloading from the GUI:
# Install the MOJO runtime on Linux PPC
pip install daimojo-2.1.12+master.106-cp36-cp36m-linux_ppc64le.whl
# Install the MOJO runtime on Linux x86
pip install daimojo-2.1.12+master.106-cp36-cp36m-linux_x86_64.whl
# Install the MOJO runtime on Mac OS X
pip install daimojo-2.1.12+master.106-cp36-cp36m-macosx_10_7_x86_64.whl
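The three wheels listed above differ only in their platform tag, so the right one can be derived from the running interpreter. The following is a minimal sketch and not part of the daimojo package: the helper name `pick_daimojo_wheel` is hypothetical, and the file names are copied from the install commands above, so they will differ for other runtime versions.

```python
import platform
import sys

# Wheel file names copied from the install commands above; other runtime
# versions will use different names.
WHEELS = {
    ("Linux", "ppc64le"): "daimojo-2.1.12+master.106-cp36-cp36m-linux_ppc64le.whl",
    ("Linux", "x86_64"): "daimojo-2.1.12+master.106-cp36-cp36m-linux_x86_64.whl",
    ("Darwin", "x86_64"): "daimojo-2.1.12+master.106-cp36-cp36m-macosx_10_7_x86_64.whl",
}


def pick_daimojo_wheel(system=None, machine=None, version=None):
    """Hypothetical helper: return the wheel name for a platform, or raise."""
    system = system or platform.system()
    machine = machine or platform.machine()
    version = tuple(version or sys.version_info[:2])
    # The cp36 tag in the file names means the wheels target Python 3.6.
    if version != (3, 6):
        raise RuntimeError("these wheels are built for Python 3.6 (cp36)")
    try:
        return WHEELS[(system, machine)]
    except KeyError:
        raise RuntimeError(f"no wheel listed for {system}/{machine}") from None


# Explicit arguments make the lookup reproducible on any machine.
print(pick_daimojo_wheel("Linux", "x86_64", (3, 6)))
```

Calling `pick_daimojo_wheel()` with no arguments inspects the current interpreter instead, and raises if it is not Python 3.6 on one of the three listed platforms.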
Running the MOJO2 Python Runtime
# import the daimojo model package
import daimojo.model
# specify the location of the MOJO
m = daimojo.model("./mojo-pipeline/pipeline.mojo")
# retrieve the creation time of the MOJO
m.created_time
# 'Mon November 18 14:00:24 2019'
# retrieve the UUID of the experiment
m.uuid
# retrieve a list of missing values
m.missing_values
# ['',
# '?',
# 'None',
# 'nan',
# 'NA',
# 'N/A',
# 'unknown',
# 'inf',
# '-inf',
# '1.7976931348623157e+308',
# '-1.7976931348623157e+308']
# retrieve the feature names
m.feature_names
# ['clump_thickness',
# 'uniformity_cell_size',
# 'uniformity_cell_shape',
# 'marginal_adhesion',
# 'single_epithelial_cell_size',
# 'bare_nuclei',
# 'bland_chromatin',
# 'normal_nucleoli',
# 'mitoses']
# retrieve the feature types
m.feature_types
# ['float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32']
# retrieve the output names
m.output_names
# ['label.B', 'label.M']
# retrieve the output types
m.output_types
# ['float64', 'float64']
# import the datatable module
import datatable as dt
# parse the example.csv file
pydt = dt.fread("./mojo-pipeline/example.csv", na_strings=m.missing_values)
pydt
# clump_thickness uniformity_cell_size uniformity_cell_shape marginal_adhesion single_epithelial_cell_size bare_nuclei bland_chromatin normal_nucleoli mitoses
# 0 8 1 3 10 6 6 9 1 1
# 1 2 1 2 2 5 3 4 8 8
# 2 1 1 1 9 4 10 3 5 4
# 3 2 6 9 10 4 8 1 1 3
# 4 10 10 8 1 8 3 6 3 4
# 5 1 8 4 5 10 1 2 5 3
# 6 2 10 2 9 1 2 9 3 8
# 7 2 8 9 2 10 10 3 5 4
# 8 6 3 8 5 2 3 5 3 4
# 9 4 2 2 8 1 2 8 9 1
# [10 rows × 9 columns]
# retrieve the column types
pydt.stypes
# (stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64)
# make predictions on the example.csv file
res = m.predict(pydt)
# retrieve the predictions
res
# label.B label.M
# 0 0.0828766 0.917123
# 1 0.776551 0.223449
# 2 0.584384 0.415616
# 3 0.105705 0.894295
# 4 0.0168561 0.983144
# 5 0.236566 0.763434
# 6 0.174103 0.825897
# 7 0.101579 0.898421
# 8 0.135462 0.864538
# 9 0.947782 0.0522176
# [10 rows × 2 columns]
# retrieve the prediction column names
res.names
# ('label.B', 'label.M')
# retrieve the prediction column types
res.stypes
# (stype.float64, stype.float64)
# convert datatable results to common data types
# res.to_pandas() # need pandas
# res.to_numpy() # need numpy
res.to_list()
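`res.to_list()` returns the frame column by column, one list per output column, so turning the result into per-row records takes one more step. The sketch below assumes that layout and hard-codes the first three rows of the predictions shown above in place of a live `res`. The helper name `to_labeled_rows` is illustrative and is not part of daimojo or datatable, and choosing the highest-probability class is only one possible decision rule.

```python
# Stand-ins for res.names and res.to_list(), using the first three rows of
# the predictions printed above (one list per output column).
names = ("label.B", "label.M")
columns = [
    [0.0828766, 0.776551, 0.584384],  # label.B
    [0.917123, 0.223449, 0.415616],   # label.M
]


def to_labeled_rows(names, columns):
    """Illustrative helper: pair each row's probabilities with its top label."""
    rows = []
    for values in zip(*columns):  # transpose columns into rows
        probs = dict(zip(names, values))
        best = max(probs, key=probs.get)
        # Strip the "label." prefix used in the output column names.
        rows.append({"probs": probs, "predicted": best.split(".", 1)[1]})
    return rows


for row in to_labeled_rows(names, columns):
    print(row["predicted"], row["probs"])
```

With a live model, `names` and `columns` would come from `res.names` and `res.to_list()`. A problem-specific threshold may be more appropriate than taking the maximum when the classes are imbalanced.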