hex.gbm
Class DHistogram<T extends DHistogram>
java.lang.Object
water.Iced
hex.gbm.DHistogram<T>
- All Implemented Interfaces:
- java.lang.Cloneable, Freezable
- Direct Known Subclasses:
- DBinHistogram
public class DHistogram<T extends DHistogram>
- extends Iced
A DHistogram, computed in parallel over a Vec.
A DHistogram bins every value added to it (into a default number of bins)
and computes the min, max, and either class distribution or mean & variance
for each bin. DHistograms are initialized with a min, max and
number-of-elements to be added (all of which are generally available from a
Vec). Bins normally run from min to max in uniform sizes, but if the
DHistogram can determine that fewer bins are needed (e.g. boolean columns run
from 0 to 1, but only ever take on 2 values, so only 2 bins are needed), then
fewer bins are used.
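The uniform binning described above can be illustrated with a minimal sketch. It is not the DHistogram source: the class name UniformBinSketch, its fields, and the chooseBins rule are all hypothetical, chosen only to show how a value maps to a bin between min and max and how per-bin min, max, mean and variance can be accumulated.

```java
import java.util.Arrays;

// Illustrative sketch only; not the hex.gbm.DHistogram implementation.
final class UniformBinSketch {
    final double min;      // lower edge of bin 0
    final double step;     // width of every bin
    final long[] count;    // rows per bin
    final double[] binMin; // smallest value seen per bin
    final double[] binMax; // largest value seen per bin
    final double[] sum;    // running sum per bin
    final double[] sumSq;  // running sum of squares per bin

    UniformBinSketch(double min, double max, int nbins) {
        if (!(max > min) || nbins < 1)
            throw new IllegalArgumentException("need max > min and nbins >= 1");
        this.min = min;
        this.step = (max - min) / nbins;
        count = new long[nbins];
        binMin = new double[nbins];
        binMax = new double[nbins];
        sum = new double[nbins];
        sumSq = new double[nbins];
        Arrays.fill(binMin, Double.POSITIVE_INFINITY);
        Arrays.fill(binMax, Double.NEGATIVE_INFINITY);
    }

    // Hypothetical rule: an integer column with few distinct values
    // gets one bin per value instead of the default bin count.
    static int chooseBins(double min, double max, boolean isInt, int defaultBins) {
        if (isInt) {
            long distinct = (long) (max - min) + 1;
            if (distinct < defaultBins) return (int) distinct;
        }
        return defaultBins;
    }

    // Map a value to its bin; out-of-range values are clamped to the end bins.
    int bin(double v) {
        int b = (int) ((v - min) / step);
        return Math.max(0, Math.min(count.length - 1, b));
    }

    void add(double v) {
        int b = bin(v);
        count[b]++;
        binMin[b] = Math.min(binMin[b], v);
        binMax[b] = Math.max(binMax[b], v);
        sum[b] += v;
        sumSq[b] += v * v;
    }

    // Mean of the values in bin b (NaN for an empty bin).
    double mean(int b) { return sum[b] / count[b]; }

    // Population variance of the values in bin b (NaN for an empty bin).
    double variance(int b) {
        double m = mean(b);
        return sumSq[b] / count[b] - m * m;
    }

    public static void main(String[] args) {
        UniformBinSketch h = new UniformBinSketch(0, 10, 5);
        for (double v : new double[] {0.5, 1.5, 2.5, 9.0, 10.0}) h.add(v);
        for (int b = 0; b < h.count.length; b++)
            System.out.printf("bin %d: n=%d min=%s max=%s mean=%s var=%s%n",
                b, h.count[b], h.binMin[b], h.binMax[b], h.mean(b), h.variance(b));
        // A 0/1 integer column needs only 2 bins, whatever the default is.
        System.out.println(chooseBins(0, 1, true, 20));
    }
}
```

The sum and sum-of-squares accumulators are used here only because they keep the sketch short; the actual class may store its per-bin statistics differently.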
If we are successively splitting rows (e.g. in a decision tree), then a
fresh DHistogram for each split will dynamically re-bin the data. Each
successive split will then logarithmically divide the data. At the first
split, outliers will end up in their own bins, but some central bins may be
very full. At the next split(s), the full bins will get split, and so on
until (after a logarithmic number of splits) each bin holds roughly the same
amount of data.
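The re-binning effect can be seen in a small, self-contained sketch. Everything here is an assumption made for illustration (the class RebinSketch, its helpers, the sample data and the split threshold); it does not use the DHistogram API. A single outlier stretches the first histogram so that almost all rows share one bin, and a fresh histogram over the rows on one side of a split spreads them out again.

```java
import java.util.Arrays;

// Illustrative sketch only; not the hex.gbm.DHistogram implementation.
final class RebinSketch {
    // Count values into nbins uniform bins spanning the data's own [min, max].
    // Assumes data is non-empty.
    static int[] histogram(double[] data, int nbins) {
        double min = Arrays.stream(data).min().getAsDouble();
        double max = Arrays.stream(data).max().getAsDouble();
        double step = (max - min) / nbins;
        int[] counts = new int[nbins];
        for (double v : data) {
            // A zero-width range puts everything in bin 0; max is clamped into the last bin.
            int b = step == 0 ? 0 : (int) ((v - min) / step);
            counts[Math.min(nbins - 1, b)]++;
        }
        return counts;
    }

    // Rows a split would send to the low side: values strictly below the threshold.
    static double[] below(double[] data, double threshold) {
        return Arrays.stream(data).filter(v -> v < threshold).toArray();
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 1000};

        // First pass: the outlier gets its own bin, one central bin is very full.
        System.out.println(Arrays.toString(histogram(data, 4)));  // [8, 0, 0, 1]

        // After a split that isolates the outlier, a fresh histogram over the
        // remaining rows re-bins them across their own, much tighter range.
        double[] left = below(data, 250.0);
        System.out.println(Arrays.toString(histogram(left, 4)));  // [2, 2, 2, 2]
    }
}
```

With real data, one split is rarely enough; the point is only that each fresh histogram adapts its range to the rows it actually receives.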
Constructor Summary
DHistogram(java.lang.String name, byte isInt)
DHistogram(java.lang.String name, byte isInt, float min, float max)
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DHistogram
public DHistogram(java.lang.String name,
byte isInt,
float min,
float max)
DHistogram
public DHistogram(java.lang.String name,
byte isInt)
smallCopy
public DHistogram smallCopy()
bigCopy
public DHistogram bigCopy()
tightenMinMax
public void tightenMinMax()
byteSize
protected static int byteSize(byte[] bs)
byteSize
protected static int byteSize(short[] ss)
byteSize
protected static int byteSize(float[] fs)
byteSize
protected static int byteSize(int[] is)
byteSize
protected static int byteSize(long[] ls)
byteSize
protected static int byteSize(double[] fs)
byteSize
protected static int byteSize(java.lang.Object[] ls)