hex
Class FrameTask<T extends FrameTask<T>>

java.lang.Object
  extended by jsr166y.ForkJoinTask<java.lang.Void>
      extended by jsr166y.CountedCompleter
          extended by water.H2O.H2OCountedCompleter
              extended by water.DTask
                  extended by water.MRTask2<T>
                      extended by hex.FrameTask<T>
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, java.util.concurrent.Future<java.lang.Void>, Freezable
Direct Known Subclasses:
GLMTask, Gram.GramTask

public abstract class FrameTask<T extends FrameTask<T>>
extends MRTask2<T>

See Also:
Serialized Form

Field Summary
protected  int[] _catOffsets
           
protected  int _cats
           
protected  int _nums
           
protected  boolean _standardize
           
 
Fields inherited from class water.MRTask2
_fr, _fs, _hi, _left, _lo, _nleft, _nodes, _nrite, _outputFrame, _rite, _topLocal
 
Fields inherited from class water.DTask
_cls, _eFromNode, _exception, _fname, _lineNum, _msg, _mth
 
Constructor Summary
protected FrameTask(FrameTask ft)
           
  FrameTask(Job job, boolean standardize, boolean hasResponse)
           
  FrameTask(Job job, boolean standardize, boolean hasResponse, int step, int offset, boolean complement)
           
 
Method Summary
 Frame adaptFrame(Frame fr)
          Reorders the frame's columns so that numeric columns come first, followed by categoricals in decreasing order of cardinality, with the response last.
 int[] catOffsets()
           
 int cats()
           
protected  void chunkDone()
          Override this to do post-chunk processing work.
protected  void chunkInit()
          Override this to initialize at the beginning of chunk processing.
 T dfork2(Frame fr)
           
 T doIt(Frame fr)
           
protected  int fullN()
           
protected  int largestCat()
           
 void map(Chunk[] chunks)
          Extracts the values, applies regularization to numerics, adds appropriate offsets to categoricals, and adapts response according to the CaseMode/CaseValue if set.
 double[] normMul()
           
 double[] normSub()
           
 int nums()
           
protected  void processRow(double[] nums, int ncats, int[] cats)
           
protected  void processRow(double[] nums, int ncats, int[] cats, double response)
          Method to process one row of the data for GLM functions.
 boolean standardize()
           
 
Methods inherited from class water.MRTask2
clone, closeLocal, compute2, dfork, dfork, dfork, dfork, dinvoke, doAll, doAll, doAll, doAll, getResult, map, map, map, map, map, map, map, map, map, map, map, onCompletion, onExceptionalCompletion, postGlobal, profString, reduce, reduce4, setupLocal, vecs
 
Methods inherited from class water.DTask
copyOver, frozenType, getDException, hasException, logVerbose, newInstance, onAck, onAckAck, read, setException, toDocField, write, writeJSONFields
 
Methods inherited from class water.H2O.H2OCountedCompleter
compute, priority
 
Methods inherited from class jsr166y.CountedCompleter
addToPendingCount, compareAndSetPendingCount, complete, exec, getCompleter, getPendingCount, getRawResult, setCompleter, setPendingCount, setRawResult, tryComplete
 
Methods inherited from class jsr166y.ForkJoinTask
adapt, adapt, adapt, cancel, compareAndSetForkJoinTaskTag, completeExceptionally, fork, get, get, getException, getForkJoinTaskTag, getPool, getQueuedTaskCount, getSurplusQueuedTaskCount, helpQuiesce, inForkJoinPool, invoke, invokeAll, invokeAll, invokeAll, isCancelled, isCompletedAbnormally, isCompletedNormally, isDone, join, peekNextLocalTask, pollNextLocalTask, pollTask, quietlyComplete, quietlyInvoke, quietlyJoin, reinitialize, setForkJoinTaskTag, tryUnfork
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_standardize

protected final boolean _standardize

_nums

protected int _nums

_cats

protected int _cats

_catOffsets

protected int[] _catOffsets
Constructor Detail

FrameTask

public FrameTask(Job job,
                 boolean standardize,
                 boolean hasResponse)

FrameTask

public FrameTask(Job job,
                 boolean standardize,
                 boolean hasResponse,
                 int step,
                 int offset,
                 boolean complement)

FrameTask

protected FrameTask(FrameTask ft)
Method Detail

standardize

public final boolean standardize()

nums

public final int nums()

cats

public final int cats()

catOffsets

public final int[] catOffsets()

normSub

public final double[] normSub()

normMul

public final double[] normMul()

fullN

protected final int fullN()

largestCat

protected final int largestCat()

processRow

protected void processRow(double[] nums,
                          int ncats,
                          int[] cats,
                          double response)
Method to process one row of the data for GLM functions. Numeric and categorical values are passed separately, as is the response. Categoricals are passed as absolute indexes into the expanded beta vector; 0-levels are skipped, so the number of passed categoricals will not be the same for every row.

Categorical expansion/indexing: Categoricals are placed at the beginning of the beta vector. Each categorical variable with n levels is expanded into n-1 independent binary variables. Indexes in cats[] point to the appropriate coefficient in the beta vector. For example, given 2 categorical columns, both with values A,B,C, rows have the following indexes:

  A,A - ncats = 0, no categoricals are passed
  A,B - ncats = 1, indexes = [2]
  B,B - ncats = 2, indexes = [0,2]

and so on.

Parameters:
nums - numeric values of this row
ncats - number of passed (non-zero) categoricals
cats - indexes of categoricals into the expanded beta vector
response - numeric value for the response
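
The indexing scheme described for cats[] can be illustrated with a small standalone sketch. It is not H2O code: the class CatIndexSketch and its methods offsets and rowIndexes are hypothetical names introduced here, and levels are assumed to be encoded as integers with A = 0, B = 1, C = 2. The sketch returns a trimmed array, whereas processRow receives the count separately in ncats.

```java
import java.util.Arrays;

// Hypothetical helper, not part of H2O: mirrors the indexing scheme described above.
final class CatIndexSketch {
    // offsets[i] is the beta-vector index of the first binary variable of
    // categorical column i; a column with n levels contributes n-1 slots.
    static int[] offsets(int[] cardinalities) {
        int[] offsets = new int[cardinalities.length + 1];
        for (int i = 0; i < cardinalities.length; i++) {
            offsets[i + 1] = offsets[i] + cardinalities[i] - 1;
        }
        return offsets;
    }

    // Returns the absolute indexes for one row; level 0 is skipped,
    // so the result length plays the role of ncats.
    static int[] rowIndexes(int[] levels, int[] offsets) {
        int[] cats = new int[levels.length];
        int ncats = 0;
        for (int i = 0; i < levels.length; i++) {
            if (levels[i] != 0) {
                cats[ncats++] = offsets[i] + levels[i] - 1;
            }
        }
        return Arrays.copyOf(cats, ncats);
    }

    public static void main(String[] args) {
        int[] offsets = offsets(new int[] {3, 3}); // two columns with levels A,B,C
        System.out.println(Arrays.toString(rowIndexes(new int[] {0, 0}, offsets))); // A,A -> []
        System.out.println(Arrays.toString(rowIndexes(new int[] {0, 1}, offsets))); // A,B -> [2]
        System.out.println(Arrays.toString(rowIndexes(new int[] {1, 1}, offsets))); // B,B -> [0, 2]
    }
}
```

The three rows printed by main reproduce the A,A / A,B / B,B example from the method description.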

processRow

protected void processRow(double[] nums,
                          int ncats,
                          int[] cats)

adaptFrame

public Frame adaptFrame(Frame fr)
Reorders the frame's columns so that numeric columns come first, followed by categoricals in decreasing order of cardinality, with the response last.

Parameters:
fr - the frame whose columns are reordered
Returns:
the frame with its columns reordered
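
The column ordering described for adaptFrame can be sketched without any H2O types. Everything below is an assumption made for illustration: the class ColumnOrderSketch, the Col descriptor, and the convention that a cardinality of 0 marks a numeric column are all hypothetical, and the response is assumed to be one of the listed columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of the ordering rule; not the H2O implementation.
final class ColumnOrderSketch {
    // Hypothetical column descriptor; cardinality 0 marks a numeric column.
    static final class Col {
        final String name;
        final int cardinality;
        Col(String name, int cardinality) {
            this.name = name;
            this.cardinality = cardinality;
        }
    }

    // Numerics first, then categoricals by decreasing cardinality, response last.
    static List<String> reorder(List<Col> cols, String response) {
        List<Col> nums = new ArrayList<>();
        List<Col> cats = new ArrayList<>();
        for (Col c : cols) {
            if (c.name.equals(response)) continue; // response is appended at the end
            if (c.cardinality == 0) nums.add(c); else cats.add(c);
        }
        // List.sort is stable, so equal cardinalities keep their original order.
        cats.sort(Comparator.comparingInt((Col c) -> c.cardinality).reversed());
        List<String> out = new ArrayList<>();
        for (Col c : nums) out.add(c.name);
        for (Col c : cats) out.add(c.name);
        out.add(response);
        return out;
    }

    public static void main(String[] args) {
        List<Col> cols = Arrays.asList(
            new Col("age", 0), new Col("color", 3), new Col("y", 0),
            new Col("city", 10), new Col("income", 0));
        System.out.println(reorder(cols, "y")); // [age, income, city, color, y]
    }
}
```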

doIt

public T doIt(Frame fr)

dfork2

public T dfork2(Frame fr)

chunkInit

protected void chunkInit()
Override this to initialize at the beginning of chunk processing.


chunkDone

protected void chunkDone()
Override this to do post-chunk processing work.
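
The way the chunkInit and chunkDone hooks might bracket per-row work can be sketched with a standalone stand-in. None of these classes are H2O types: RowTaskSketch, SumTaskSketch, and processChunk are hypothetical, and the sketch assumes that the hooks are called once before and once after the rows of each chunk. It also runs sequentially on a single instance, which a distributed task would not do.

```java
// Hypothetical stand-in for the hook pattern; none of these classes are H2O types.
abstract class RowTaskSketch {
    protected void chunkInit() {}   // hook: called before the rows of a chunk
    protected void chunkDone() {}   // hook: called after the rows of a chunk
    protected abstract void processRow(double[] nums);

    // Plays the role of map(): calls the hooks around the per-row work.
    final void processChunk(double[][] rows) {
        chunkInit();
        for (double[] row : rows) {
            processRow(row);
        }
        chunkDone();
    }
}

// Example subclass: sums all values, using the hooks to manage per-chunk scratch state.
final class SumTaskSketch extends RowTaskSketch {
    private double chunkSum; // scratch state, reset for every chunk
    double total;            // accumulated across chunks
    int chunksSeen;

    @Override protected void chunkInit() { chunkSum = 0; }

    @Override protected void processRow(double[] nums) {
        for (double v : nums) chunkSum += v;
    }

    @Override protected void chunkDone() {
        total += chunkSum;
        chunksSeen++;
    }

    public static void main(String[] args) {
        SumTaskSketch t = new SumTaskSketch();
        t.processChunk(new double[][] {{1.0, 2.0}, {3.0}});
        t.processChunk(new double[][] {{4.0}});
        System.out.println(t.total);      // 10.0
        System.out.println(t.chunksSeen); // 2
    }
}
```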


map

public final void map(Chunk[] chunks)
Extracts the values, applies regularization to numerics, adds appropriate offsets to categoricals, and adapts response according to the CaseMode/CaseValue if set.

Overrides:
map in class MRTask2<T extends FrameTask<T>>