hex.rf
Class ConfusionTask

java.lang.Object
  extended by jsr166y.ForkJoinTask<java.lang.Void>
      extended by jsr166y.CountedCompleter
          extended by water.H2O.H2OCountedCompleter
              extended by water.DTask<T>
                  extended by water.DRemoteTask<T>
                      extended by water.MRTask
                          extended by hex.rf.ConfusionTask
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, java.util.concurrent.Future<java.lang.Void>, Freezable

public class ConfusionTask
extends MRTask

Confusion Matrix. Incrementally computes a Confusion Matrix for a forest of Trees against a given input dataset. The set of Trees can grow over time. Each request for the Confusion Matrix runs the computation on any new trees (if any) and reports a matrix. This is cheap if all trees have already been computed.

See Also:
Serialized Form

Nested Class Summary
static class ConfusionTask.CMFinal
           
static class ConfusionTask.CMJob
           
 
Field Summary
 ValueArray _data
           
 int _DATA_N
           
 int _MODEL_N
           
 int _N
           
 
Fields inherited from class water.MRTask
_hi, _lo
 
Fields inherited from class water.DRemoteTask
_fs, _is_local, _keys
 
Fields inherited from class water.DTask
_cls, _eFromNode, _exception, _fname, _lineNum, _msg, _mth
 
Constructor Summary
ConfusionTask()
          Constructor for use by the serializers
 
Method Summary
static int alignEnumDomains(java.lang.String[] modelDomain, java.lang.String[] dataDomain, int[] modelMapping, int[] dataMapping)
          Merge the model and data predictor domains to produce the domain for the CM.
 int dimension()
          Return the number of classes, which is also the dimension of the CM.
 java.lang.String[] domain()
          Compute confusion matrix domain based on model and data key.
static java.lang.String[] domain(int N, ValueArray.Column modelCol, ValueArray.Column dataCol, int[] modelEnumMapping, int[] dataEnumMapping)
           
static java.lang.String[] domain(ValueArray.Column modelCol, ValueArray.Column dataCol)
           
 void init()
          Once-per-remote invocation init.
 Key keyForCM()
           
static Key keyForCM(Key modelKey, int msize, Key datakey, int classcol, boolean computeOOB)
           
static ConfusionTask.CMJob make(RFModel model, int modelSize, Key datakey, int classcol, double[] classWt, boolean computeOOB)
           
static ConfusionTask.CMJob make(RFModel model, Key datakey, int classcol, double[] classWt, boolean computeOOB)
          Apply a model to a dataset to produce a Confusion Matrix.
 void map(Key chunkKey)
          A classic Map/Reduce style incremental computation of the confusion matrix on a chunk of data.
 void reduce(DRemoteTask drt)
          Reduction combines the confusion matrices.
static void remove(RFModel model, Key datakey, int classcol, boolean computeOOB)
           
 
Methods inherited from class water.MRTask
hi, lcompute, lo, lonCompletion, memOverheadPerChunk, onExceptionalCompletion
 
Methods inherited from class water.DRemoteTask
alsoBlockFor, alsoBlockFor, clone, compute2, dfork, getFutures, invoke, invokeOnAllNodes, keys, merge, merge, merge, onCompletion, reduceAlsoBlock
 
Methods inherited from class water.DTask
copyOver, dinvoke, frozenType, getDException, hasException, logVerbose, newInstance, onAck, onAckAck, read, setException, toDocField, write, writeJSONFields
 
Methods inherited from class water.H2O.H2OCountedCompleter
compute, priority
 
Methods inherited from class jsr166y.CountedCompleter
addToPendingCount, compareAndSetPendingCount, complete, exec, getCompleter, getPendingCount, getRawResult, setCompleter, setPendingCount, setRawResult, tryComplete
 
Methods inherited from class jsr166y.ForkJoinTask
adapt, adapt, adapt, cancel, compareAndSetForkJoinTaskTag, completeExceptionally, fork, get, get, getException, getForkJoinTaskTag, getPool, getQueuedTaskCount, getSurplusQueuedTaskCount, helpQuiesce, inForkJoinPool, invoke, invokeAll, invokeAll, invokeAll, isCancelled, isCompletedAbnormally, isCompletedNormally, isDone, join, peekNextLocalTask, pollNextLocalTask, pollTask, quietlyComplete, quietlyInvoke, quietlyJoin, reinitialize, setForkJoinTaskTag, tryUnfork
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_data

public transient ValueArray _data

_N

public transient int _N

_MODEL_N

public transient int _MODEL_N

_DATA_N

public transient int _DATA_N
Constructor Detail

ConfusionTask

public ConfusionTask()
Constructor for use by the serializers

Method Detail

keyForCM

public Key keyForCM()

keyForCM

public static Key keyForCM(Key modelKey,
                           int msize,
                           Key datakey,
                           int classcol,
                           boolean computeOOB)

remove

public static void remove(RFModel model,
                          Key datakey,
                          int classcol,
                          boolean computeOOB)

make

public static ConfusionTask.CMJob make(RFModel model,
                                       Key datakey,
                                       int classcol,
                                       double[] classWt,
                                       boolean computeOOB)
Apply a model to a dataset to produce a Confusion Matrix. To support incremental and repeated model application, the model and data are hashed into a Key. If that Key already exists, the prior CM is returned instead of being recomputed.
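
The lookup-before-compute idea described for make can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the H2O implementation: CmCacheSketch, cacheKey and getOrCompute are hypothetical names, the in-memory HashMap stands in for the distributed store, and the key layout is only a guess modelled on the parameters of keyForCM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch, not part of hex.rf. A HashMap stands in for the key/value store.
final class CmCacheSketch {
    private final Map<String, long[][]> store = new HashMap<>();
    int computations = 0; // counts how often a matrix was actually computed

    // Assumed key layout: combine every input that determines the result.
    static String cacheKey(String modelKey, int modelSize, String dataKey,
                           int classCol, boolean computeOOB) {
        return modelKey + "/" + modelSize + "/" + dataKey + "/" + classCol + "/" + computeOOB;
    }

    // Return the prior matrix if the key already exists; otherwise compute and store it.
    long[][] getOrCompute(String key, Supplier<long[][]> compute) {
        long[][] prior = store.get(key);
        if (prior != null) {
            return prior;
        }
        computations++;
        long[][] cm = compute.get();
        store.put(key, cm);
        return cm;
    }
}
```

Because the model size is part of the assumed key, a forest that has grown maps to a new key and triggers a fresh computation, while a repeated request for the same forest and dataset returns the stored matrix.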


make

public static ConfusionTask.CMJob make(RFModel model,
                                       int modelSize,
                                       Key datakey,
                                       int classcol,
                                       double[] classWt,
                                       boolean computeOOB)

init

public void init()
Once-per-remote invocation init. The standard M/R framework will endlessly clone the original object "for free" (well, for very low cost), but the wire-line format does not send over things we can compute locally. So compute locally, once, some things we want in all cloned instances.

Overrides:
init in class MRTask
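
The pattern described for init (skip sending what can be recomputed, then rebuild it once on the receiving side) can be illustrated with standard Java serialization. This is only an analogy under stated assumptions: H2O uses its own wire format rather than java.io serialization, and TransientInitSketch, values, cachedSum and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch: a task with one serialized field and one locally derived field.
final class TransientInitSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final int[] values;      // travels with the serialized task
    transient int cachedSum; // never serialized; rebuilt locally by init()

    TransientInitSketch(int[] values) {
        this.values = values;
    }

    // Recompute the derived state once on the receiving side.
    void init() {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        cachedSum = sum;
    }

    // Serialize and deserialize in memory to imitate sending the task to a remote node.
    static TransientInitSketch roundTrip(TransientInitSketch task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientInitSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After the round trip the transient field holds its default value, so the copy must call init() before use. The transient fields _data, _N, _MODEL_N and _DATA_N of this class are declared in the same spirit.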

map

public void map(Key chunkKey)
A classic Map/Reduce style incremental computation of the confusion matrix on a chunk of data.

Specified by:
map in class MRTask

reduce

public void reduce(DRemoteTask drt)
Reduction combines the confusion matrices.

Specified by:
reduce in class DRemoteTask
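
The division of work between map and reduce can be sketched without any H2O types. This is a minimal illustration, not the actual implementation: CmMapReduceSketch and mapChunk are hypothetical names, plain int arrays stand in for a chunk of rows and the forest's predictions, and the convention of rows for actual classes and columns for predicted classes is an assumption.

```java
// Hypothetical sketch: plain arrays stand in for H2O chunks and tasks.
final class CmMapReduceSketch {
    // "Map": count (actual, predicted) pairs for one chunk of rows.
    static long[][] mapChunk(int[] actual, int[] predicted, int numClasses) {
        long[][] cm = new long[numClasses][numClasses];
        for (int i = 0; i < actual.length; i++) {
            cm[actual[i]][predicted[i]]++;
        }
        return cm;
    }

    // "Reduce": combine two partial matrices by element-wise addition.
    static long[][] reduce(long[][] left, long[][] right) {
        int n = left.length;
        long[][] sum = new long[n][n];
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                sum[r][c] = left[r][c] + right[r][c];
            }
        }
        return sum;
    }
}
```

Because addition is associative and commutative, partial matrices can be combined in any order, which is what lets the chunks be processed independently.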

alignEnumDomains

public static int alignEnumDomains(java.lang.String[] modelDomain,
                                   java.lang.String[] dataDomain,
                                   int[] modelMapping,
                                   int[] dataMapping)
Merge the model and data predictor domains to produce the domain for the CM. Each domain is expected to be ordered and to contain unique values.
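
One way to merge two ordered, duplicate-free domains is a two-pointer walk, sketched below. This is an illustration under assumptions, not the documented behaviour: DomainMergeSketch and mergeSortedDomains are hypothetical names, and the reading that the return value is the merged domain size and that each mapping array receives the merged index of its domain's entries is inferred from the signature only.

```java
// Hypothetical sketch of merging two sorted, duplicate-free domains.
final class DomainMergeSketch {
    // Fills modelMapping[i] and dataMapping[j] with positions in the merged domain
    // and returns the merged domain size (assumed semantics).
    static int mergeSortedDomains(String[] modelDomain, String[] dataDomain,
                                  int[] modelMapping, int[] dataMapping) {
        int idx = 0, m = 0, d = 0;
        while (m < modelDomain.length || d < dataDomain.length) {
            if (m == modelDomain.length) {        // only data values remain
                dataMapping[d++] = idx;
            } else if (d == dataDomain.length) {  // only model values remain
                modelMapping[m++] = idx;
            } else {
                int c = modelDomain[m].compareTo(dataDomain[d]);
                if (c < 0) {
                    modelMapping[m++] = idx;      // value present only in the model
                } else if (c > 0) {
                    dataMapping[d++] = idx;       // value present only in the data
                } else {
                    modelMapping[m++] = idx;      // shared value gets one merged slot
                    dataMapping[d++] = idx;
                }
            }
            idx++;
        }
        return idx;
    }
}
```

For example, merging the model domain {"a", "c"} with the data domain {"b", "c", "d"} yields a merged size of 4, with model indices {0, 2} and data indices {1, 2, 3}. The ordering and uniqueness requirements are what make a single linear pass sufficient.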


domain

public static java.lang.String[] domain(ValueArray.Column modelCol,
                                        ValueArray.Column dataCol)

domain

public static java.lang.String[] domain(int N,
                                        ValueArray.Column modelCol,
                                        ValueArray.Column dataCol,
                                        int[] modelEnumMapping,
                                        int[] dataEnumMapping)

domain

public java.lang.String[] domain()
Compute confusion matrix domain based on model and data key.


dimension

public final int dimension()
Return the number of classes, which is also the dimension of the CM.