hex.rf
Class RFModel

java.lang.Object
  extended by water.Iced
      extended by water.OldModel
          extended by hex.rf.RFModel
All Implemented Interfaces:
java.lang.Cloneable, Freezable, Job.Progress

public class RFModel
extends OldModel
implements Job.Progress

A model is an ensemble of trees that can be serialized and that can be used to classify data.


Field Summary
 int _features
          Number of features these trees are built for
 Key[][] _localForests
          Local forests produced by nodes
 int[] _nodesSplitFeatures
          Number of computed split features per node - number of split features can differ for each node.
 Key[][] _remoteChunksKeys
          Remote chunks' keys used by individual nodes
 float _sample
          Sampling rate used when building trees.
 Sampling.Strategy _samplingStrategy
          Sampling strategy used for model
 int _splitFeatures
          Number of split features defined by user.
 float[] _strataSamples
          Strata sampling rate used for local-node strata-sampling
 long _time
          Total time in seconds to produce model
 Key[] _tkeys
          All the trees in the model
 int _totalTrees
          Number of keys the model expects to be built for it
 byte[][] _trees
           
static java.lang.String KEY_PREFIX
           
 
Fields inherited from class water.OldModel
_dataKey, _selfKey, _va, DOC_FIELDS
 
Constructor Summary
RFModel()
          Empty constructor for deserialization
RFModel(Key selfKey, int[] cols, Key dataKey, Key[] tkeys, int features, Sampling.Strategy samplingStrategy, float sample, float[] strataSamples, int splitFeatures, int totalTrees)
          A RandomForest Model
RFModel(Key selfKey, java.lang.String[] colNames, java.lang.String[] classNames, Key[] tkeys, int features, float sample)
           
 
Method Summary
 int classes()
           
 short classify(int[] votes, double[] classWt, java.util.Random rand)
           
 short classify(ValueArray data, AutoBuffer chunk, int row, int[] modelDataMap, int[] votes, double[] classWt, java.util.Random rand)
           
 short classify0(int tree_id, ValueArray data, AutoBuffer chunk, int row, int[] modelDataMap, short badrow)
          Classify a row according to one particular tree.
 void deleteKeys()
          Bad name, I know, but this frees all internal tree keys.
 Counter depth()
           
 void find_leaves_depth()
          Internal computation of depth and number of leaves.
 long getTreeSeed(int i)
          Return the random seed used to sample this tree.
 Counter leaves()
           
static RFModel make(RFModel old, Key tkey, int nodeIdx)
           
static Key makeKey()
           
 java.lang.String name(int atree)
           
 float progress()
           
protected  double score0(double[] data)
          Single row scoring, on properly ordered data
protected  double score0(ValueArray data, AutoBuffer ab, int row_in_chunk)
          Bulk scoring API, on a compatible ValueArray (when pushed through the mapping)
protected  double score0(ValueArray data, int row)
          Single row scoring, on a compatible ValueArray (when pushed through the mapping)
 int size()
           
 com.google.gson.JsonObject toJson()
           
 byte[] tree(int tree_id)
          Return the bits for a particular tree
 int treeCount()
          The number of trees in this model.
 
Methods inherited from class water.OldModel
adapt, adapt, columnFilter, columnMapping, delete, fromJson, identityMap, isCompatible, isCompatible, isCompatible, response, responseName, score, score
 
Methods inherited from class water.Iced
clone, frozenType, init, newInstance, read, toDocField, write, writeJSON, writeJSONFields
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_features

public int _features
Number of features these trees are built for


_samplingStrategy

public Sampling.Strategy _samplingStrategy
Sampling strategy used for model


_sample

public float _sample
Sampling rate used when building trees.


_strataSamples

public float[] _strataSamples
Strata sampling rate used for local-node strata-sampling


_splitFeatures

public int _splitFeatures
Number of split features defined by user.


_nodesSplitFeatures

public int[] _nodesSplitFeatures
Number of computed split features per node. The number of split features can differ for each node; however, such a difference would point to a problem with data distribution.
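
The consistency condition described for _nodesSplitFeatures can be checked with a small helper. The sketch below is illustrative only: SplitFeatureCheck and isUniform are hypothetical names that do not exist in hex.rf, and the method works on a plain int[] standing in for the field.

```java
// Hypothetical helper, not part of hex.rf: operates on a plain int[] that
// stands in for RFModel._nodesSplitFeatures.
final class SplitFeatureCheck {
    /** Returns true when every node computed the same number of split features. */
    static boolean isUniform(int[] nodesSplitFeatures) {
        if (nodesSplitFeatures == null || nodesSplitFeatures.length == 0) {
            return true; // nothing to compare
        }
        int first = nodesSplitFeatures[0];
        for (int n : nodesSplitFeatures) {
            if (n != first) {
                return false; // nodes disagree: possible data-distribution problem
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUniform(new int[] {4, 4, 4}));
        System.out.println(isUniform(new int[] {4, 3, 4}));
    }
}
```

A false result does not prove the data is badly distributed; it only flags the condition that the field description calls out as suspicious.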


_totalTrees

public int _totalTrees
Number of keys the model expects to be built for it


_tkeys

public Key[] _tkeys
All the trees in the model


_localForests

public Key[][] _localForests
Local forests produced by nodes


_remoteChunksKeys

public Key[][] _remoteChunksKeys
Remote chunks' keys used by individual nodes


_time

public long _time
Total time in seconds to produce model


_trees

public transient byte[][] _trees

KEY_PREFIX

public static final java.lang.String KEY_PREFIX
See Also:
Constant Field Values

Constructor Detail

RFModel

public RFModel(Key selfKey,
               int[] cols,
               Key dataKey,
               Key[] tkeys,
               int features,
               Sampling.Strategy samplingStrategy,
               float sample,
               float[] strataSamples,
               int splitFeatures,
               int totalTrees)
A RandomForest Model

Parameters:
treeskey - a key of keys of trees
classes - the number of response classes
data - the dataset

RFModel

public RFModel(Key selfKey,
               java.lang.String[] colNames,
               java.lang.String[] classNames,
               Key[] tkeys,
               int features,
               float sample)

RFModel

public RFModel()
Empty constructor for deserialization

Method Detail

make

public static RFModel make(RFModel old,
                           Key tkey,
                           int nodeIdx)

makeKey

public static final Key makeKey()

treeCount

public int treeCount()
The number of trees in this model.


size

public int size()

classes

public int classes()

progress

public float progress()
Specified by:
progress in interface Job.Progress
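
This page gives no description for progress(). One plausible reading, given the _tkeys and _totalTrees fields, is the fraction of expected trees built so far. The sketch below illustrates that assumption with a hypothetical stand-in class, ProgressSketch; it is not the RFModel implementation.

```java
// Hypothetical stand-in, not part of hex.rf.
// ASSUMPTION: progress = trees built so far / trees expected in total.
final class ProgressSketch {
    static float progress(int treesBuilt, int totalTrees) {
        if (totalTrees <= 0) {
            return 0f; // avoid division by zero when no trees are expected
        }
        // Clamp to 1 in case more trees are reported than were expected.
        return Math.min(1f, treesBuilt / (float) totalTrees);
    }

    public static void main(String[] args) {
        System.out.println(progress(5, 20));
    }
}
```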

name

public java.lang.String name(int atree)

tree

public byte[] tree(int tree_id)
Return the bits for a particular tree


deleteKeys

public void deleteKeys()
Bad name, I know, but this frees all internal tree keys.


classify0

public short classify0(int tree_id,
                       ValueArray data,
                       AutoBuffer chunk,
                       int row,
                       int[] modelDataMap,
                       short badrow)
Classify a row according to one particular tree.

Parameters:
tree_id - the number of the tree to use
chunk - the chunk we are using
row - the row number in the chunk
modelDataMap - mapping from model/tree columns to data columns
Returns:
the predicted response class, or class+1 for broken rows
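
The roles of modelDataMap and badrow can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: SingleTreeSketch is a hypothetical one-split tree, the row is a plain double[] rather than a ValueArray chunk, and a NaN value is taken to mark a broken row. It does not reproduce the serialized tree format used by RFModel.

```java
// Hypothetical stand-in for one tree: a single split on one model column.
// Not part of hex.rf; rows are plain double[] instead of ValueArray chunks.
final class SingleTreeSketch {
    final int modelColumn;   // column index as the tree (model) sees it
    final double threshold;  // split point
    final short leftClass;   // class when value <= threshold
    final short rightClass;  // class when value > threshold

    SingleTreeSketch(int modelColumn, double threshold, short leftClass, short rightClass) {
        this.modelColumn = modelColumn;
        this.threshold = threshold;
        this.leftClass = leftClass;
        this.rightClass = rightClass;
    }

    /**
     * Classifies one row. modelDataMap[i] is the data column that holds the
     * value for model column i. ASSUMPTION: NaN marks a broken row, in which
     * case badrow is returned instead of a class.
     */
    short classify0(double[] row, int[] modelDataMap, short badrow) {
        double v = row[modelDataMap[modelColumn]];
        if (Double.isNaN(v)) {
            return badrow;
        }
        return v <= threshold ? leftClass : rightClass;
    }

    public static void main(String[] args) {
        SingleTreeSketch tree = new SingleTreeSketch(0, 2.5, (short) 0, (short) 1);
        int[] modelDataMap = {2, 0, 1}; // model column 0 lives in data column 2
        System.out.println(tree.classify0(new double[] {9.0, 9.0, 1.0}, modelDataMap, (short) 2));
    }
}
```

Passing a badrow value outside the valid class range lets a caller tell broken rows apart from real predictions.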

classify

public short classify(ValueArray data,
                      AutoBuffer chunk,
                      int row,
                      int[] modelDataMap,
                      int[] votes,
                      double[] classWt,
                      java.util.Random rand)

classify

public short classify(int[] votes,
                      double[] classWt,
                      java.util.Random rand)
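
This overload is undocumented here. A reasonable guess from the parameter names is that votes holds a per-class tally from the trees, classWt optionally reweights each class, and rand breaks ties. The sketch below implements that guess in a hypothetical class, VoteSketch; the actual behavior of RFModel.classify may differ.

```java
// Hypothetical stand-in, not part of hex.rf.
// ASSUMPTIONS: votes[c] is the number of trees voting for class c,
// classWt (may be null) scales each class tally, rand breaks exact ties.
final class VoteSketch {
    static short classify(int[] votes, double[] classWt, java.util.Random rand) {
        double best = Double.NEGATIVE_INFINITY;
        int ties = 0;
        short result = -1; // returned unchanged only when votes is empty
        for (int c = 0; c < votes.length; c++) {
            double score = (classWt == null) ? votes[c] : votes[c] * classWt[c];
            if (score > best) {
                best = score;
                result = (short) c;
                ties = 1;
            } else if (score == best) {
                // Reservoir-style pick: each tied class ends up equally likely.
                ties++;
                if (rand.nextInt(ties) == 0) {
                    result = (short) c;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        java.util.Random rand = new java.util.Random(42);
        System.out.println(classify(new int[] {3, 7, 2}, null, rand));
    }
}
```

Taking the Random as a parameter, as the real signature does, keeps tie-breaking reproducible when the caller supplies a seeded generator.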

find_leaves_depth

public void find_leaves_depth()
Internal computation of depth and number of leaves.


leaves

public Counter leaves()

depth

public Counter depth()

getTreeSeed

public long getTreeSeed(int i)
Return the random seed used to sample this tree.


score0

protected double score0(double[] data)
Single row scoring, on properly ordered data

Specified by:
score0 in class OldModel

score0

protected double score0(ValueArray data,
                        int row)
Single row scoring, on a compatible ValueArray (when pushed through the mapping)

Overrides:
score0 in class OldModel

score0

protected double score0(ValueArray data,
                        AutoBuffer ab,
                        int row_in_chunk)
Bulk scoring API, on a compatible ValueArray (when pushed through the mapping)

Overrides:
score0 in class OldModel

toJson

public com.google.gson.JsonObject toJson()
Overrides:
toJson in class OldModel