water
Class OldModel

java.lang.Object
  extended by water.Iced
      extended by water.OldModel
All Implemented Interfaces:
java.lang.Cloneable, Freezable
Direct Known Subclasses:
DGLM.GLMModel, DPCA.PCAModel, KMeansModel, RFModel

public abstract class OldModel
extends Iced

A Model models reality (hopefully). A model can be used to 'score' a row, or a collection of rows, on any compatible dataset, meaning the dataset has all the columns with the same names as those used to build the model.


Field Summary
 Key _dataKey
          Dataset key used to *build* the model, for models for which this makes sense, or null otherwise.
 Key _selfKey
          Key associated with this OldModel, if any.
 ValueArray _va
          Columns used in the model.
static DocGen.FieldDoc[] DOC_FIELDS
           
 
Constructor Summary
OldModel()
          Empty constructor for deserialization
OldModel(Key key)
           
OldModel(Key key, int[] cols, Key dataKey)
          Default model, built from the selected columns of the given dataset.
OldModel(Key key, OldModel m)
          Simple shallow copy constructor
OldModel(Key key, java.lang.String[] colNames, java.lang.String[] classNames)
          Default artificial model, built from given column names.
OldModel(Key key, ValueArray va, Key dataKey)
          Artificial model.
 
Method Summary
 OldModel adapt(java.lang.String[] colNames)
          Adapt model for given columns.
 OldModel adapt(ValueArray ary)
          Adapt model for the given dataset.
 boolean columnFilter(ValueArray.Column C)
           
 int[] columnMapping(java.lang.String[] names)
          Map from the model's columns to the given column names, or to -1 if no column name maps.
 void delete()
          Called when deleting this model, to clean up any internal keys
 void fromJson(com.google.gson.JsonObject json)
           
static boolean identityMap(int[] mapping)
          Check that this is the identity map
static boolean isCompatible(int[] mapping)
          Check if this mapping is compatible.
 boolean isCompatible(java.lang.String[] names)
          Check if this model is compatible with this collection of column names.
 boolean isCompatible(ValueArray data)
          Check if this dataset is compatible with this model.
 ValueArray.Column response()
          Response column info
 java.lang.String responseName()
           
 double score(double[] data)
           
 double score(ValueArray ary, AutoBuffer bits, int rid)
           
protected abstract  double score0(double[] data)
           
protected  double score0(ValueArray data, AutoBuffer ab, int row_in_chunk)
          Bulk scoring API, on a compatible ValueArray (when pushed through the mapping)
protected  double score0(ValueArray data, int row)
          Single row scoring, on a compatible ValueArray (when pushed through the mapping)
 com.google.gson.JsonObject toJson()
           
 
Methods inherited from class water.Iced
clone, frozenType, init, newInstance, read, toDocField, write, writeJSON, writeJSONFields
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DOC_FIELDS

public static DocGen.FieldDoc[] DOC_FIELDS

_selfKey

public final Key _selfKey
Key associated with this OldModel, if any.


_va

public final ValueArray _va
Columns used in the model. No dataset needs to be mapped to this ValueArray; it is just used to control for valid column data. The mean and sigma are from the training dataset listed below, and are used when normalizing scoring data. The Column names are used to match up with scoring data columns. The last Column is the response column.
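
The description above says the training mean and sigma are used when normalizing scoring data, but does not give the formula. The sketch below assumes plain standardization, (x - mean) / sigma, which is one common reading and not confirmed by this page. The class NormalizeSketch, the method standardize, and the handling of a zero sigma are all hypothetical and not part of the water API.

```java
final class NormalizeSketch {
    /**
     * Standardizes one scoring value with the training column's statistics.
     * ASSUMPTION: "normalizing" means (x - mean) / sigma; the real model
     * classes may normalize differently.
     */
    static double standardize(double x, double mean, double sigma) {
        if (sigma == 0.0) {
            return 0.0; // constant training column: assumed to carry no signal
        }
        return (x - mean) / sigma;
    }
}
```

Under this assumption, a scoring value of 12 in a column whose training mean is 10 and sigma is 2 would be passed to the model as 1.0.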


_dataKey

public final Key _dataKey
Dataset key used to *build* the model, for models for which this makes sense, or null otherwise. Not all models are built from a dataset (e.g. artificial models), and not all are built from a single dataset (e.g. various ensemble models), so this key has no *mathematical* significance in the model but is handy during common model-building and for the historical record.

Constructor Detail

OldModel

public OldModel()
Empty constructor for deserialization


OldModel

public OldModel(Key key)

OldModel

public OldModel(Key key,
                int[] cols,
                Key dataKey)
Default model, built from the selected columns of the given dataset. Data to be scored on the model has to have all the same columns (in any order, extra cols are ok). Last column is the response column, or -1 if there is no defined response column.


OldModel

public OldModel(Key key,
                java.lang.String[] colNames,
                java.lang.String[] classNames)
Default artificial model, built from given column names.


OldModel

public OldModel(Key key,
                ValueArray va,
                Key dataKey)
Artificial model. The 'va' defines the compatible data, but is not associated with any real dataset. Data to be scored on the model has to have all the same columns (in any order, extra cols are ok). The last column is the response column.


OldModel

public OldModel(Key key,
                OldModel m)
Simple shallow copy constructor

Method Detail

columnFilter

public boolean columnFilter(ValueArray.Column C)

delete

public void delete()
Called when deleting this model, to clean up any internal keys


response

public final ValueArray.Column response()
Response column info


responseName

public final java.lang.String responseName()

columnMapping

public final int[] columnMapping(java.lang.String[] names)
Map from the model's columns to the given column names, or to -1 if no column name maps. The last entry is the mapping for the response column name. Returned values range from -1 to the number of columns in the dataset/names[] (which may have more columns than the model).
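
As an illustration of the mapping described above, here is a minimal standalone sketch. It does not use the water classes: ColumnMappingSketch and its modelCols parameter are hypothetical stand-ins for the model's own column names (response last), and the choice to take the first occurrence of a duplicated name is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class ColumnMappingSketch {
    /**
     * For each model column name (response last), returns the index of the
     * same name in names, or -1 when the name is absent.
     */
    static int[] columnMapping(String[] modelCols, String[] names) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            index.putIfAbsent(names[i], i); // first occurrence wins (assumption)
        }
        int[] mapping = new int[modelCols.length];
        for (int i = 0; i < modelCols.length; i++) {
            mapping[i] = index.getOrDefault(modelCols[i], -1);
        }
        return mapping;
    }

    public static void main(String[] args) {
        String[] modelCols = {"age", "weight", "outcome"}; // "outcome" is the response
        String[] names = {"id", "weight", "age"};          // extra column, different order
        System.out.println(Arrays.toString(columnMapping(modelCols, names)));
        // prints [2, 1, -1]
    }
}
```

Here both predictors are found despite the reordering and the extra "id" column, while the response maps to -1 because the scoring data does not contain it.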


identityMap

public static boolean identityMap(int[] mapping)
Check that this is the identity map
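
A mapping is the identity when every entry equals its own position, which means the scoring data already has the model's columns in the model's order. The sketch below shows that check in isolation. IdentityMapSketch is a hypothetical class, and how the real method treats a null or empty mapping is not stated on this page, so the sketch simply treats an empty array as the identity.

```java
final class IdentityMapSketch {
    /** Returns true when mapping[i] == i for every position i. */
    static boolean identityMap(int[] mapping) {
        for (int i = 0; i < mapping.length; i++) {
            if (mapping[i] != i) {
                return false; // column i would have to be read from elsewhere
            }
        }
        return true; // also reached for an empty mapping (assumption)
    }
}
```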


isCompatible

public static boolean isCompatible(int[] mapping)
Check if this mapping is compatible. Just means no -1 entries in the predictor variables (response column is not checked).
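
The rule above can be sketched as a loop over every entry except the last, since the last entry is the response column and is not checked. MappingCompatibilitySketch is a hypothetical standalone class written for illustration, not the water implementation.

```java
final class MappingCompatibilitySketch {
    /**
     * Returns true when no predictor column maps to -1.
     * The final entry (the response column) is deliberately skipped.
     */
    static boolean isCompatible(int[] mapping) {
        for (int i = 0; i < mapping.length - 1; i++) {
            if (mapping[i] == -1) {
                return false; // a predictor is missing from the data
            }
        }
        return true;
    }
}
```

Under this reading, a mapping such as {2, 1, -1} is compatible (only the response is missing), while {2, -1, 0} is not.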


isCompatible

public final boolean isCompatible(java.lang.String[] names)
Check if this model is compatible with this collection of column names.


isCompatible

public final boolean isCompatible(ValueArray data)
Check if this dataset is compatible with this model. All the columns in the model have to be present, but extra columns may exist and the columns may be in a different order.


adapt

public OldModel adapt(ValueArray ary)
Adapt model for the given dataset. Default behavior is to map columns and categoricals to their original indexes. Categorical values not seen when building the model are translated to NaN. Override this to get custom adapt behavior (e.g. to handle unseen categoricals differently).

Parameters:
ary - test dataset
Returns:
OldModel - model adapted to be applied on the given data
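
The default categorical handling described above (map each level back to its training index, translate unseen levels to NaN) can be sketched for a single column as follows. CategoricalAdaptSketch, levelMapping, and the representation of a column's levels as a String[] are illustrative assumptions, and the sketch does not touch ValueArray.

```java
import java.util.HashMap;
import java.util.Map;

final class CategoricalAdaptSketch {
    /**
     * For each level of the test column, returns the index that level had
     * in the training column, or NaN when the level was never seen.
     */
    static double[] levelMapping(String[] trainLevels, String[] testLevels) {
        Map<String, Integer> trainIndex = new HashMap<>();
        for (int i = 0; i < trainLevels.length; i++) {
            trainIndex.put(trainLevels[i], i);
        }
        double[] mapped = new double[testLevels.length];
        for (int i = 0; i < testLevels.length; i++) {
            Integer idx = trainIndex.get(testLevels[i]);
            mapped[i] = (idx == null) ? Double.NaN : idx; // unseen level becomes NaN
        }
        return mapped;
    }
}
```

With training levels {"red", "green", "blue"} and test levels {"blue", "purple", "red"}, the test levels map to 2, NaN, and 0. A subclass that overrides adapt could substitute a different policy for the NaN branch.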

adapt

public OldModel adapt(java.lang.String[] colNames)
Adapt model for given columns. Only permutes the columns by the column names (factor levels MUST match the training dataset).

Parameters:
colNames - column names to adapt the model to
Returns:
OldModel - model adapted to the given columns

score

public double score(double[] data)

score

public double score(ValueArray ary,
                    AutoBuffer bits,
                    int rid)

score0

protected abstract double score0(double[] data)

score0

protected double score0(ValueArray data,
                        int row)
Single row scoring, on a compatible ValueArray (when pushed through the mapping)


score0

protected double score0(ValueArray data,
                        AutoBuffer ab,
                        int row_in_chunk)
Bulk scoring API, on a compatible ValueArray (when pushed through the mapping)


toJson

public com.google.gson.JsonObject toJson()

fromJson

public void fromJson(com.google.gson.JsonObject json)