hex.gbm
Class DBinHistogram
java.lang.Object
water.Iced
hex.gbm.DHistogram<DBinHistogram>
hex.gbm.DBinHistogram
- All Implemented Interfaces:
- java.lang.Cloneable, Freezable
public class DBinHistogram
- extends DHistogram<DBinHistogram>
A Histogram, computed in parallel over a Vec.
A DBinHistogram bins every value added to it, and computes the Vec min & max
(for use in the next split), and the response mean & variance for each bin.
DBinHistograms are initialized with a min, max and number-of-elements to be
added (all of which are generally available from a Vec). Bins run from min to
max in uniform sizes. If the DBinHistogram can determine that fewer bins are
needed (e.g. boolean columns run from 0 to 1, but only ever take on 2 values,
so only 2 bins are needed), then fewer bins are used.
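The uniform binning described above can be illustrated with a minimal sketch. This is not the H2O implementation: the class UniformBinSketch and its members are hypothetical names chosen for illustration, and the response mean & variance tracking is left out to keep the example short. It assumes that out-of-range values are clamped into the first or last bin.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the DBinHistogram source.
class UniformBinSketch {
    final float min;      // lower edge of the first bin
    final float step;     // width of every bin
    final long[] counts;  // number of values seen per bin
    final float[] mins;   // smallest value seen per bin
    final float[] maxs;   // largest value seen per bin

    public UniformBinSketch(int nbins, float min, float max) {
        if (nbins < 1 || !(max > min))
            throw new IllegalArgumentException("need nbins >= 1 and max > min");
        this.min = min;
        this.step = (max - min) / nbins;
        this.counts = new long[nbins];
        this.mins = new float[nbins];
        this.maxs = new float[nbins];
        Arrays.fill(mins, Float.POSITIVE_INFINITY);
        Arrays.fill(maxs, Float.NEGATIVE_INFINITY);
    }

    // Map a value to its bin index, clamping to the valid range
    // (assumption: the value equal to max lands in the last bin).
    public int bin(float v) {
        int idx = (int) ((v - min) / step);
        return Math.max(0, Math.min(counts.length - 1, idx));
    }

    // Record one value: bump the bin count and tighten that bin's min & max.
    public void add(float v) {
        int b = bin(v);
        counts[b]++;
        if (v < mins[b]) mins[b] = v;
        if (v > maxs[b]) maxs[b] = v;
    }
}
```

With 4 bins over the range 0 to 8, each bin is 2 wide, so the value 3 falls in bin 1 and the value 8 is clamped into bin 3. The per-bin min & max recorded by add are what a later split could use to set tighter limits.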
If we are successively splitting rows (e.g. in a decision tree), then a fresh
DBinHistogram for each split will dynamically re-bin the data. Each successive
split will logarithmically divide the data. At the first split, outliers will
end up in their own bins, but some central bins may be very full. At the next
split(s), the full bins will get split, and so on until (after a logarithmic
number of splits) each bin holds roughly the same amount of data. This dynamic
binning resolves a lot of problems with picking the proper bin count or
limits: generally a few more tree levels will match any fancy but fixed-size
binning strategy.
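The effect of re-binning after a split can be seen in a small sketch. This is an illustration under stated assumptions, not the H2O code: the class RebinSketch and the method histogram are hypothetical, and a split is modeled simply as narrowing the [min, max] range so that values outside it are skipped.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the DBinHistogram source.
class RebinSketch {
    // Count values into nbins uniform bins spanning [min, max].
    // Values outside the range are skipped, standing in for rows
    // that were sent down the other side of a split.
    public static long[] histogram(float[] vals, int nbins, float min, float max) {
        long[] counts = new long[nbins];
        float step = (max - min) / nbins;
        for (float v : vals) {
            if (v < min || v > max) continue;
            int b = Math.min(nbins - 1, (int) ((v - min) / step));
            counts[b]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        float[] data = {1f, 2f, 3f, 4f, 5f, 6f, 7f, 1000f};
        // First pass: the outlier 1000 gets its own bin, the rest crowd into bin 0.
        System.out.println(Arrays.toString(histogram(data, 4, 1f, 1000f)));
        // After a split isolates the crowded bin, a fresh histogram over its
        // observed range [1, 7] spreads the same rows across all bins.
        System.out.println(Arrays.toString(histogram(data, 4, 1f, 7f)));
    }
}
```

In this example the first histogram is [7, 0, 0, 1] and the re-binned one is [2, 1, 2, 2], so a single extra level already gives a far more even spread without any tuning of the bin count.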
Constructor Summary
DBinHistogram(java.lang.String name,
char nbins,
byte isInt,
float min,
float max,
long nelems)
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
_step
public final float _step
_bmin
public final float _bmin
_nbins
public final char _nbins
_bins
public long[] _bins
_mins
public float[] _mins
_maxs
public float[] _maxs
DBinHistogram
public DBinHistogram(java.lang.String name,
char nbins,
byte isInt,
float min,
float max,
long nelems)
smallCopy
public DHistogram smallCopy()
- Overrides:
smallCopy
in class DHistogram<DBinHistogram>
bigCopy
public DBinHistogram bigCopy()
- Overrides:
bigCopy
in class DHistogram<DBinHistogram>
fini
public void fini()
tightenMinMax
public void tightenMinMax()
- Overrides:
tightenMinMax
in class DHistogram<DBinHistogram>
initialHist
public static DBinHistogram[] initialHist(Frame fr,
int ncols,
char nbins)
isConstantResponse
public boolean isConstantResponse()
toString
public java.lang.String toString()
- Overrides:
toString
in class java.lang.Object