Driverless AI MOJO Scoring Pipeline - C++ Runtime with Python and R Wrappers

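The C++ Scoring Pipeline is provided as R and Python packages for the protobuf-based MOJO2 protocol. The packages are self-contained, so no additional software is required. Once the MOJO Scoring Pipeline has been built, use whichever of the two wrappers you prefer.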

Notes:

This scoring pipeline is not currently available for RuleFit models.
If the MOJO Scoring Pipeline is disabled, the Download MOJO Scoring Pipeline button appears as Build MOJO Scoring Pipeline.
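If the Reduce MOJO Size expert setting is enabled, Driverless AI attempts to reduce the size of the MOJO Scoring Pipeline while the experiment is being built. For details, see the documentation for that setting.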

Downloading the Scoring Pipeline Runtimes

The R and Python packages can be downloaded from within the Driverless AI application. To do so, click Resources, then click MOJO2 R Runtime or MOJO2 Py Runtime in the drop-down menu. In the pop-up menu that appears, click the button that corresponds to your OS: Linux or IBM PowerPC.
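The choice between the Linux (x86) and IBM PowerPC downloads can be checked against the machine that will run the scorer. The following is a minimal sketch, assuming that the strings reported by Python's `platform.machine()` map onto the two architecture tags used in the runtime file names on this page. The helper `runtime_arch` is hypothetical and is not part of any Driverless AI package.

```python
import platform

# Machine strings (as reported by platform.machine()) mapped to the
# architecture tag used in the runtime file names on this page.
# Assumption: only these two Linux architectures are offered.
_ARCH_TAGS = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "ppc64le": "ppc64le",
}


def runtime_arch(system=None, machine=None):
    """Return the architecture tag of the runtime package to download.

    Raises ValueError when the platform is not one of the listed options.
    """
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    if system != "Linux":
        raise ValueError(f"unsupported OS: {system}")
    if machine not in _ARCH_TAGS:
        raise ValueError(f"unsupported architecture: {machine}")
    return _ARCH_TAGS[machine]
```

Calling `runtime_arch()` with no arguments inspects the current machine; passing explicit values, for example `runtime_arch("Linux", "ppc64le")`, lets you check a remote scoring host from its reported platform strings.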

Examples

The following examples show how to use the R and Python APIs of the C++ MOJO runtime.

Prerequisites (R wrapper)

Linux OS (x86 or PPC)
Driverless AI license (file or environment variable)
Rcpp (>=1.0.0)
data.table

Running the MOJO2 R Runtime

# Install the R MOJO runtime using one of the methods below

# Install the R MOJO runtime on PPC Linux
install.packages("./daimojo_2.5.15_ppc64le-linux.tar.gz")

# Install the R MOJO runtime on x86 Linux
install.packages("./daimojo_2.5.15_x86_64-linux.tar.gz")

# Load the MOJO
library(daimojo)
m <- load.mojo("./mojo-pipeline/pipeline.mojo")

# retrieve the creation time of the MOJO
create.time(m)
## [1] "2019-11-18 22:00:24 UTC"

# retrieve the UUID of the experiment
uuid(m)
## [1] "65875c15-943a-4bc0-a162-b8984fe8e50d"

# Load data and make predictions
col_class <- setNames(feature.types(m), feature.names(m))  # column names and types

library(data.table)
d <- fread("./mojo-pipeline/example.csv", colClasses=col_class, header=TRUE, sep=",")

predict(m, d)
##       label.B    label.M
## 1  0.08287659 0.91712341
## 2  0.77655075 0.22344925
## 3  0.58438434 0.41561566
## 4  0.10570505 0.89429495
## 5  0.01685609 0.98314391
## 6  0.23656610 0.76343390
## 7  0.17410333 0.82589667
## 8  0.10157948 0.89842052
## 9  0.13546191 0.86453809
## 10 0.94778244 0.05221756

Prerequisites (Python wrapper)

Linux OS (x86 or PPC)
Driverless AI license (file or environment variable)
Python 3.6

datatable. Run the following to install it:

# Install on Linux PPC or Linux x86
pip install datatable

A non-binary version of protobuf:

pip install --no-binary=protobuf protobuf

Python MOJO runtime. After downloading it from the GUI, run one of the following commands:

# Install the MOJO runtime on Linux PPC
pip install daimojo-2.5.15-cp36-cp36m-linux_ppc64le.whl

# Install the MOJO runtime on Linux x86
pip install daimojo-2.5.15-cp36-cp36m-linux_x86_64.whl
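The wheel file names above encode the Python version and architecture they target (`cp36` and `linux_x86_64` or `linux_ppc64le`), which is why Python 3.6 is listed as a prerequisite. The sketch below shows one way to check a downloaded wheel against the current interpreter before running `pip install`. It assumes the standard wheel file-name layout and compares only the Python and platform tags; pip's own compatibility check is more elaborate. The helper names are hypothetical.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel file name into its python, ABI, and platform tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel name: {filename}")
    # The last three fields are the python tag, ABI tag, and platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def wheel_matches(filename, version_info=None, machine=None):
    """Return True if the wheel targets this CPython version and Linux architecture."""
    version_info = version_info or sys.version_info
    machine = machine or platform.machine()
    python_tag, _, platform_tag = parse_wheel_tags(filename)
    expected_python = f"cp{version_info[0]}{version_info[1]}"
    return python_tag == expected_python and platform_tag == f"linux_{machine}"
```

For example, `wheel_matches("daimojo-2.5.15-cp36-cp36m-linux_x86_64.whl")` returns `False` under any interpreter other than CPython 3.6, which flags the mismatch before pip rejects the file.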

Running the MOJO2 Python Runtime

# import the daimojo model package
import daimojo.model

# specify the location of the MOJO
m = daimojo.model("./mojo-pipeline/pipeline.mojo")

# retrieve the creation time of the MOJO
m.created_time
# 'Mon November 18 14:00:24 2019'

# retrieve the UUID of the experiment
m.uuid

# retrieve a list of missing values
m.missing_values
# ['',
#  '?',
#  'None',
#  'nan',
#  'NA',
#  'N/A',
#  'unknown',
#  'inf',
#  '-inf',
#  '1.7976931348623157e+308',
#  '-1.7976931348623157e+308']

# retrieve the feature names
m.feature_names
# ['clump_thickness',
#  'uniformity_cell_size',
#  'uniformity_cell_shape',
#  'marginal_adhesion',
#  'single_epithelial_cell_size',
#  'bare_nuclei',
#  'bland_chromatin',
#  'normal_nucleoli',
#  'mitoses']

# retrieve the feature types
m.feature_types
# ['float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32']

# retrieve the output names
m.output_names
# ['label.B', 'label.M']

# retrieve the output types
m.output_types
# ['float64', 'float64']

# import the datatable module
import datatable as dt

# parse the example.csv file
pydt = dt.fread("./mojo-pipeline/example.csv", na_strings=m.missing_values, header=True, sep=',')
pydt
#     clump_thickness  uniformity_cell_size  uniformity_cell_shape  marginal_adhesion  single_epithelial_cell_size  bare_nuclei  bland_chromatin  normal_nucleoli  mitoses
# 0                 8                     1                      3                 10                            6            6                9                1        1
# 1                 2                     1                      2                  2                            5            3                4                8        8
# 2                 1                     1                      1                  9                            4           10                3                5        4
# 3                 2                     6                      9                 10                            4            8                1                1        3
# 4                10                    10                      8                  1                            8            3                6                3        4
# 5                 1                     8                      4                  5                           10            1                2                5        3
# 6                 2                    10                      2                  9                            1            2                9                3        8
# 7                 2                     8                      9                  2                           10           10                3                5        4
# 8                 6                     3                      8                  5                            2            3                5                3        4
# 9                 4                     2                      2                  8                            1            2                8                9        1

# [10 rows × 9 columns]

# retrieve the column types
pydt.stypes
# (stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64)

# make predictions on the example.csv file
res = m.predict(pydt)

# retrieve the predictions
res
#           label.B     label.M
# 0     0.0828766       0.917123
# 1     0.776551        0.223449
# 2     0.584384        0.415616
# 3     0.105705        0.894295
# 4     0.0168561       0.983144
# 5     0.236566        0.763434
# 6     0.174103        0.825897
# 7     0.101579        0.898421
# 8     0.135462        0.864538
# 9     0.947782        0.0522176

# [10 rows × 2 columns]

# retrieve the prediction column names
res.names
# ('label.B', 'label.M')

# retrieve the prediction column types
res.stypes
# (stype.float64, stype.float64)

# convert datatable results to common data types
# res.to_pandas()  # need pandas
# res.to_numpy()   # need numpy
res.to_list()
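`res.to_list()` returns the predictions column by column, one list per output. To turn those class probabilities into a predicted label per row, pair the columns with the output names and take the highest score in each row. The following is a minimal sketch that uses literal values copied from the first three rows of the output above in place of a live `res` object, so it runs without the MOJO runtime. The helper `top_labels` is hypothetical.

```python
def top_labels(names, columns):
    """Return, for each row, the output name with the highest score."""
    rows = zip(*columns)  # transpose column-wise data into rows
    return [names[max(range(len(row)), key=row.__getitem__)] for row in rows]


# Stand-ins for res.names and res.to_list(), copied from the output above.
names = ("label.B", "label.M")
columns = [
    [0.0828766, 0.776551, 0.584384],  # label.B
    [0.917123, 0.223449, 0.415616],   # label.M
]

labels = top_labels(names, columns)
print(labels)  # ['label.M', 'label.B', 'label.B']
```

With a real result frame, the same call would be `top_labels(res.names, res.to_list())`. Taking the highest probability corresponds to a 0.5 threshold in the binary case; a different decision threshold would need its own comparison.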