Model

water
Class Model

java.lang.Object
  water.Iced
      water.Model

All Implemented Interfaces: java.lang.Cloneable, Freezable

Direct Known Subclasses: DTree.TreeModel, GLMModel, KMeans2.KMeans2Model, NeuralNet.NeuralNetModel

public abstract class Model
extends Iced

A Model models reality (hopefully). A model can be used to 'score' a row or a collection of rows on any compatible dataset, meaning the rows have all the columns, with the same names, that were used to build the model.

Nested Class Summary
`protected static class`	`Model.SB`

Field Summary
`Key`	`_dataKey` Dataset key used to build the model, for models for which this makes sense, or null otherwise.
`java.lang.String[][]`	`_domains` Categorical/factor/enum mappings, per column.
`java.lang.String[]`	`_names` Names of the columns used in the model, which are matched against the columns of scoring data.
`Key`	`_selfKey` Key associated with this Model, if any.
`static DocGen.FieldDoc[]`	`DOC_FIELDS`

Constructor Summary
`Model(Key selfKey, Key dataKey, Frame fr)` Full constructor from frame: Strips out the Vecs to just the names needed to match columns later for future datasets.
`Model(Key selfKey, Key dataKey, java.lang.String[] names, java.lang.String[][] domains)` Full constructor
`Model(Key selfKey, Model m)` Simple shallow copy constructor to a new Key

Method Summary
`Frame[]`	`adapt(Frame fr, boolean exact)` Build an adapted Frame from the given Frame.
`java.lang.String[]`	`classNames()`
`ConfusionMatrix`	`cm()` For classifiers, confusion matrix on validation set.
`void`	`delete()` Called when deleting this model, to clean up any internal keys
`static int[]`	`getDomainMapping(java.lang.String colName, java.lang.String[] modelDom, java.lang.String[] dom, boolean exact)` Returns a mapping between the value domains of a given column.
`boolean`	`isClassifier()`
`int`	`nclasses()`
`java.lang.String`	`responseName()`
`double`	`score(double[] data)`
`Frame`	`score(Frame fr, boolean exact)` Bulk score the frame 'fr', producing a Frame result; the 1st Vec is the predicted class, the remaining Vecs are the probability distributions.
`float[]`	`score(Frame fr, boolean exact, int row)` Single row scoring, on a compatible Frame.
`float[]`	`score(int[][] map, double[] row, float[] preds)` Single row scoring, on a compatible set of data, given an adaption vector
`float[]`	`score(java.lang.String[] names, java.lang.String[][] domains, boolean exact, double[] row)` Single row scoring, on a compatible set of data.
`protected float[]`	`score0(Chunk[] chks, int row_in_chunk, double[] tmp, float[] preds)` Bulk scoring API for one row.
`protected abstract float[]`	`score0(double[] data, float[] preds)` Subclasses implement the scoring logic.
`void`	`testJavaScoring(Frame fr)`
`java.lang.String`	`toJava()` Return a String which is a valid Java program representing a class that implements the Model.
`protected void`	`toJavaInit(javassist.CtClass ct)`
`protected void`	`toJavaInit(Model.SB sb)`
`protected void`	`toJavaPredictBody(Model.SB sb)`

Methods inherited from class water.Iced
`clone, frozenType, init, newInstance, read, toDocField, write, writeJSON, writeJSONFields`

Methods inherited from class java.lang.Object
`equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

DOC_FIELDS

public static DocGen.FieldDoc[] DOC_FIELDS

_selfKey

public final Key _selfKey

Key associated with this Model, if any.

_dataKey

public final Key _dataKey

Dataset key used to *build* the model, for models for which this makes sense, or null otherwise. Not all models are built from a dataset (e.g. artificial models), and some are built from more than one dataset (e.g. various ensemble models), so this key has no *mathematical* significance in the model; it is handy during common model building and for the historical record.

_names

public final java.lang.String[] _names

Names of the columns used in the model, which are matched against the columns of scoring data. The last name is the response column name.

_domains

public final java.lang.String[][] _domains

Categorical/factor/enum mappings, per column. Null for non-enum cols. The last column holds the response col enums.
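As a rough illustration of the layout described for _names and _domains, the sketch below uses plain arrays instead of the H2O classes. The class name ColumnLayoutSketch, its helper methods, and the column values are all hypothetical; only the two conventions (the last entry is the response, and a null domain marks a non-enum column) come from the field descriptions above. Treating a null response domain as a single-class (regression) model is an assumption.

```java
// Hypothetical stand-alone sketch; not part of the water package.
final class ColumnLayoutSketch {
    // Column names; by the convention above, the last name is the response column.
    static final String[] NAMES = { "age", "color", "label" };

    // Per-column domains; null marks a non-enum (numeric) column.
    // The last entry holds the response column's enum levels.
    static final String[][] DOMAINS = {
        null,
        { "blue", "green", "red" },
        { "no", "yes" }
    };

    // The response name is simply the last column name.
    static String responseName(String[] names) {
        return names[names.length - 1];
    }

    // Number of response levels; a null response domain is treated here as
    // one class (regression). That treatment is an assumption of this sketch.
    static int nclasses(String[][] domains) {
        String[] responseDomain = domains[domains.length - 1];
        return responseDomain == null ? 1 : responseDomain.length;
    }

    public static void main(String[] args) {
        System.out.println(responseName(NAMES));
        System.out.println(nclasses(DOMAINS));
    }
}
```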

Constructor Detail

Model

public Model(Key selfKey,
             Key dataKey,
             Frame fr)

Full constructor from frame: Strips out the Vecs to just the names needed to match columns later for future datasets.

Model

public Model(Key selfKey,
             Key dataKey,
             java.lang.String[] names,
             java.lang.String[][] domains)

Full constructor

Model

public Model(Key selfKey,
             Model m)

Simple shallow copy constructor to a new Key

Method Detail

delete

public void delete()

Called when deleting this model, to clean up any internal keys

responseName

public java.lang.String responseName()

classNames

public java.lang.String[] classNames()

isClassifier

public boolean isClassifier()

nclasses

public int nclasses()

cm

public ConfusionMatrix cm()

For classifiers, confusion matrix on validation set.

score

public Frame score(Frame fr,
                   boolean exact)

Bulk score the frame 'fr', producing a Frame result; the 1st Vec is the predicted class, the remaining Vecs are the probability distributions. For Regression (single-class) models, the 1st and only Vec is the prediction value. The 'exact' flag describes how hard to try to adapt the frame.
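The result layout for a classifier (prediction first, distribution after) can be pictured for a single row as a float array. The sketch below is stand-alone and hypothetical: the class PredsLayoutSketch and its method are not H2O API, and choosing the predicted class as the most probable one is an assumption, since the description above does not say how the class is picked.

```java
// Hypothetical stand-alone sketch; not part of the water package.
final class PredsLayoutSketch {
    // preds[1..] holds one probability per class; preds[0] receives the
    // index of the most probable class (an assumed selection rule).
    static float[] fillPredictedClass(float[] preds) {
        int best = 1;
        for (int i = 2; i < preds.length; i++) {
            if (preds[i] > preds[best]) best = i;
        }
        preds[0] = best - 1; // shift by one because slot 0 is the prediction
        return preds;
    }

    public static void main(String[] args) {
        float[] preds = { 0f, 0.2f, 0.7f, 0.1f };
        System.out.println(fillPredictedClass(preds)[0]);
    }
}
```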

score

public final float[] score(Frame fr,
                           boolean exact,
                           int row)

Single row scoring, on a compatible Frame.

score

public final float[] score(java.lang.String[] names,
                           java.lang.String[][] domains,
                           boolean exact,
                           double[] row)

Single row scoring, on a compatible set of data. Fairly expensive to adapt.

score

public final float[] score(int[][] map,
                           double[] row,
                           float[] preds)

Single row scoring, on a compatible set of data, given an adaption vector

adapt

public Frame[] adapt(Frame fr,
                     boolean exact)

Build an adapted Frame from the given Frame. Useful for efficient bulk scoring of a new dataset to an existing model. Same adaption as above, but expressed as a Frame instead of as an int[][]. The returned Frame does not have a response column. It returns a two-element array containing the adapted frame and a frame that contains only the vectors that were adapted (the second frame exists so that deleting it deletes all adapted vectors).

getDomainMapping

public static int[] getDomainMapping(java.lang.String colName,
                                     java.lang.String[] modelDom,
                                     java.lang.String[] dom,
                                     boolean exact)

Returns a mapping between the value domains of a given column.
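The exact layout of the returned int[] and the effect of 'exact' are not spelled out here, so the following is only one plausible reading, written as a stand-alone sketch. The class DomainMappingSketch and the method mapLevels are hypothetical names; the sketch assumes that each level of the dataset's domain is mapped to its index in the model's domain, that unknown levels become -1, and that 'exact' turns an unknown level into an error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not the implementation of getDomainMapping.
final class DomainMappingSketch {
    // For each level in dom, return its index in modelDom (assumed semantics).
    static int[] mapLevels(String[] modelDom, String[] dom, boolean exact) {
        Map<String, Integer> modelIndex = new HashMap<>();
        for (int i = 0; i < modelDom.length; i++) {
            modelIndex.put(modelDom[i], i);
        }
        int[] map = new int[dom.length];
        for (int i = 0; i < dom.length; i++) {
            Integer idx = modelIndex.get(dom[i]);
            if (idx == null && exact) {
                // Assumed behaviour: an exact match tolerates no unknown levels.
                throw new IllegalArgumentException("Level not in model domain: " + dom[i]);
            }
            map[i] = (idx == null) ? -1 : idx;
        }
        return map;
    }

    public static void main(String[] args) {
        String[] modelDom = { "blue", "green", "red" };
        String[] dom = { "red", "blue", "purple" };
        System.out.println(java.util.Arrays.toString(mapLevels(modelDom, dom, false)));
    }
}
```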

score0

protected float[] score0(Chunk[] chks,
                         int row_in_chunk,
                         double[] tmp,
                         float[] preds)

Bulk scoring API for one row. The Chunks are all compatible with the model, and the last Chunks are expected to hold the final distribution and prediction. The default method just loads the data into the tmp array, then calls the subclass scoring logic.

score0

protected abstract float[] score0(double[] data,
                                  float[] preds)

Subclasses implement the scoring logic. The data is pre-loaded into a re-used temp array, in the order the model expects. The predictions are loaded into the re-used temp array, which is also returned.
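To make the contract concrete, here is a minimal sketch of a subclass supplying the scoring logic. It does not extend water.Model: ScorerSketch is a hypothetical stand-in that only mirrors the score0(double[], float[]) signature, and LogisticScorer with its toy logistic rule is invented for illustration. The preds layout (prediction in slot 0, class probabilities after) follows the description of score(Frame, boolean).

```java
// Hypothetical stand-in that mirrors only the abstract scoring signature.
abstract class ScorerSketch {
    protected abstract float[] score0(double[] data, float[] preds);
}

// Toy two-class scorer: the probability of class 1 is a logistic function
// of the first feature. The rule itself is illustrative only.
final class LogisticScorer extends ScorerSketch {
    @Override
    protected float[] score0(double[] data, float[] preds) {
        double p1 = 1.0 / (1.0 + Math.exp(-data[0]));
        preds[1] = (float) (1.0 - p1);   // probability of class 0
        preds[2] = (float) p1;           // probability of class 1
        preds[0] = p1 >= 0.5 ? 1f : 0f;  // predicted class
        return preds;                    // the same re-used array is returned
    }

    public static void main(String[] args) {
        float[] preds = new float[3];    // allocated once, re-used per row
        double[][] rows = { { -2.0 }, { 0.0 }, { 3.0 } };
        LogisticScorer scorer = new LogisticScorer();
        for (double[] row : rows) {
            System.out.println(java.util.Arrays.toString(scorer.score0(row, preds)));
        }
    }
}
```

Re-using one preds array across rows, as in main, avoids a per-row allocation, which matches the "re-used temp array" wording above.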

score

public double score(double[] data)

toJava

public java.lang.String toJava()

Return a String which is a valid Java program representing a class that implements the Model. The Java is of the form:

    class UUIDxxxxModel {
      public static final String NAMES[] = { ....column names... }
      public static final String DOMAINS[][] = { ....domain names... }
      // Pass in data in a double[], pre-aligned to the Model's requirements.
      // Jam predictions into the preds[] array; preds[0] is reserved for the
      // main prediction (class for classifiers or value for regression),
      // and remaining columns hold a probability distribution for classifiers.
      float[] predict( double data[], float preds[] );
      double[] map( HashMap row, double data[] );
      // Does the mapping lookup for every row, no allocation
      float[] predict( HashMap row, double data[], float preds[] );
      // Allocates a double[] for every row
      float[] predict( HashMap row, float preds[] );
      // Allocates a double[] and a float[] for every row
      float[] predict( HashMap row );
    }
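
For orientation, a hand-written class following the form above might look like the sketch below. It is not output of toJava(): the class name ToyGeneratedModel, the two columns, and the toy decision rule are hypothetical. It also assumes that NAMES lists only the feature columns, that map() encodes a categorical value as its index in DOMAINS, and that a missing value becomes NaN; none of these details are stated above.

```java
import java.util.Arrays;
import java.util.HashMap;

// Hypothetical hand-written example of the generated-class form.
final class ToyGeneratedModel {
    public static final String[] NAMES = { "x", "color" };
    public static final String[][] DOMAINS = { null, { "blue", "red" } };

    // Core scoring on pre-aligned data; preds[0] is the main prediction,
    // preds[1..] the class distribution. The rule is a toy.
    float[] predict(double[] data, float[] preds) {
        float p1 = (data[0] > 0 || data[1] == 1) ? 0.9f : 0.1f;
        preds[1] = 1f - p1;
        preds[2] = p1;
        preds[0] = p1 > 0.5f ? 1f : 0f;
        return preds;
    }

    // Looks up each model column by name and writes it into data[].
    double[] map(HashMap<String, Object> row, double[] data) {
        for (int i = 0; i < NAMES.length; i++) {
            Object v = row.get(NAMES[i]);
            if (v == null) {
                data[i] = Double.NaN;                      // assumed missing-value encoding
            } else if (DOMAINS[i] == null) {
                data[i] = ((Number) v).doubleValue();      // numeric column
            } else {
                data[i] = Arrays.asList(DOMAINS[i]).indexOf(v); // level index, -1 if unknown
            }
        }
        return data;
    }

    // Does the mapping lookup for every row, no allocation.
    float[] predict(HashMap<String, Object> row, double[] data, float[] preds) {
        return predict(map(row, data), preds);
    }

    // Allocates a double[] for every row.
    float[] predict(HashMap<String, Object> row, float[] preds) {
        return predict(map(row, new double[NAMES.length]), preds);
    }

    // Allocates a double[] and a float[] for every row.
    float[] predict(HashMap<String, Object> row) {
        return predict(map(row, new double[NAMES.length]), new float[3]);
    }

    public static void main(String[] args) {
        HashMap<String, Object> row = new HashMap<>();
        row.put("x", -1.0);
        row.put("color", "red");
        System.out.println(Arrays.toString(new ToyGeneratedModel().predict(row)));
    }
}
```

The three HashMap overloads trade convenience for allocation: a caller scoring many rows would pass its own data[] and preds[] buffers, while a one-off call can use the single-argument form.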

toJavaInit

protected void toJavaInit(Model.SB sb)

toJavaInit

protected void toJavaInit(javassist.CtClass ct)

toJavaPredictBody

protected void toJavaPredictBody(Model.SB sb)

testJavaScoring

public void testJavaScoring(Frame fr)
