java.lang.Object
  water.Iced
    water.fvec.Vec
public class Vec
A single distributed vector column.
A distributed vector has a count of elements, an element-to-chunk mapping, a Java type (mostly determines rounding on store and display), and functions to directly load elements without further indirections. The data is compressed, or backed by disk or both. *Writing* to elements may throw if the backing data is read-only (file backed).
Vec Key format is: Key.VEC - byte, 0 - byte, 0 - int, normal Key bytes. DVec Key format is: Key.DVEC - byte, 0 - byte, chunk# - int, normal Key bytes.
The main API is at, set, and isNA:
    double  at  ( long row );        // Returns the value expressed as a double. NaN if missing.
    long    at8 ( long row );        // Returns the value expressed as a long. Throws if missing.
    boolean isNA( long row );        // True if the value is missing.
    set( long row, double d );       // Stores a double; NaN will be treated as missing.
    set( long row, long l );         // Stores a long; throws if l exceeds what fits in a double & any floats are ever set.
    setNA( long row );               // Sets the value as missing.
Note this dangerous scenario: loading a missing value as a double, and setting it as a long:
    set(row,(long)at(row)); // Danger!
The cast from a Double.NaN to a long produces a zero! This code will replace a missing value with a zero.
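The hazard above comes from plain Java narrowing conversion and can be reproduced without H2O. The following is a minimal standalone sketch; the class `NaNCastDemo` and the helper `safeToLong` are hypothetical names introduced only for illustration and are not part of the Vec API.

```java
// Standalone illustration of the NaN-to-long hazard; no H2O classes are used.
// NaNCastDemo and safeToLong are hypothetical names, not part of water.fvec.Vec.
final class NaNCastDemo {

    /** Casting NaN to long yields 0 under Java's narrowing primitive conversion. */
    static long castMissing() {
        double missing = Double.NaN;   // how a missing value looks when read as a double
        return (long) missing;         // silently becomes 0
    }

    /** Returns null for a missing (NaN) value instead of converting it to 0. */
    static Long safeToLong(double d) {
        return Double.isNaN(d) ? null : Long.valueOf((long) d);
    }

    public static void main(String[] args) {
        System.out.println(castMissing());           // prints 0
        System.out.println(safeToLong(Double.NaN));  // prints null
        System.out.println(safeToLong(42.0));        // prints 42
    }
}
```

Against a Vec, the same guard would presumably be written with the methods listed above, for example testing isNA(row) first and only then calling set(row, at8(row)).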
Nested Class Summary | |
---|---|
static class | Vec.CollectDomain |
static class | Vec.VectorGroup: Class representing the group of vectors. |
Field Summary | |
---|---|
java.lang.String[] | _domain: Enum/factor/categorical names. |
Key | _key: Key mapping a Value which holds this Vec. |
static int | LOG_CHK: Log-2 of Chunk size. |
Constructor Summary | |
---|---|
Vec(Key key, double d) |
Method Summary | |
---|---|
Vec | adaptTo(Vec v, boolean exact): Adapt given vector v to this vector. |
void | asEnum(): Deprecated. |
double | at(long i): Fetch element the slow way, as a double. |
long | at8(long i): Fetch element the slow way, as a long. |
long | byteSize(): Size of compressed vector data. |
Chunk | chunk(long i): The Chunk for a row#. |
long | chunk2StartElem(int cidx): Convert a chunk-index into a starting row #. |
Value | chunkIdx(int cidx): Get a Chunk's Value by index. |
Key | chunkKey(int cidx): Get a Chunk Key from a chunk-index. |
int | chunkLen(int cidx): Number of rows in chunk. |
java.lang.String[] | defaultLevels(): Deprecated. |
java.lang.String[] | domain(): Return an array of domains. |
java.lang.String | domain(long i): Map the integer value of an enum/factor/categorical to its String. |
Chunk | elem2BV(int cidx): The Chunk for a chunk#. |
Vec.VectorGroup | group(): Get the group this vector belongs to. |
boolean | isEnum(): Is the column a factor/categorical/enum? Note: all "isEnum()" columns are also "isInt()" but not vice-versa. |
boolean | isInt(): Are all values integers? |
boolean | isNA(long row): Fetch the missing-status the slow way. |
long | length(): Number of elements in the vector. |
Vec | makeCon(double d) |
Vec | makeCon(long l): Make a new vector with the same size and data layout as the old one, and initialized to a constant. |
Vec | makeTransf(int[] domMap) |
Vec | makeTransf(int[] domMap, java.lang.String[] domain) |
Vec | makeZero(): Make a new vector with the same size and data layout as the old one, and initialized to zero. |
double | max(): Return column max - lazily computed as needed. |
double | mean(): Return column mean - lazily computed as needed. |
double | min(): Return column min - lazily computed as needed. |
long | naCnt(): Return column missing-element-count - lazily computed as needed. |
int | nChunks(): Number of chunks. |
void | postWrite(): Stop writing into this Vec. |
protected boolean | readable(): Default read/write behavior for Vecs. |
void | remove(Futures fs) |
Vec | rollupStats(): Compute the roll-up stats as needed, and copy them into the Vec object. |
void | rollupStats(Futures fs) |
void | rollupStats(H2O.H2OCountedCompleter cc) |
double | set(long i, double d): Write element the slow way, as a double. |
float | set(long i, float f): Write element the slow way, as a float. |
long | set(long i, long l): Write element the slow way, as a long. |
boolean | setNA(long i): Set the element as missing the slow way. |
double | sigma(): Return column standard deviation - lazily computed as needed. |
Vec | toEnum(): Transform this vector to enum. |
java.lang.String | toString(): Pretty print the Vec: [#elems, min/mean/max]{chunks,...} |
protected boolean | writable(): Default read/write behavior for Vecs. |
Methods inherited from class water.Iced |
---|
clone, frozenType, init, newInstance, read, toDocField, write, writeJSON, writeJSONFields |
Methods inherited from class java.lang.Object |
---|
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final int LOG_CHK
public final Key _key
public java.lang.String[] _domain
Constructor Detail |
---|
public Vec(Key key, double d)
Method Detail |
---|
public Vec makeZero()
public Vec makeCon(long l)
public Vec makeCon(double d)
public Vec makeTransf(int[] domMap)
public Vec makeTransf(int[] domMap, java.lang.String[] domain)
public Vec adaptTo(Vec v, boolean exact)
Adapt given vector v to this vector, i.e., unify domains and call makeTransf().
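How adaptTo actually unifies domains is not spelled out on this page. As a rough sketch of the idea only, the following standalone code builds a unified domain from two String[] domains and derives an int[] index map of the kind a domMap argument to makeTransf suggests. The class `DomainUnifyDemo`, the methods `unify` and `domainMap`, and the choice of a sorted union are all assumptions made for illustration, not the library's implementation.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical sketch of domain unification; not taken from the H2O sources.
final class DomainUnifyDemo {

    /** Assumption: the unified domain is the sorted union of both domains. */
    static String[] unify(String[] a, String[] b) {
        TreeSet<String> all = new TreeSet<>(Arrays.asList(a));
        all.addAll(Arrays.asList(b));
        return all.toArray(new String[0]);
    }

    /** For each level index in 'from', the index of the same level name in 'unified'. */
    static int[] domainMap(String[] from, String[] unified) {
        int[] map = new int[from.length];
        for (int i = 0; i < from.length; i++) {
            // 'unified' is sorted, so binary search finds every level of 'from'.
            map[i] = Arrays.binarySearch(unified, from[i]);
        }
        return map;
    }

    public static void main(String[] args) {
        String[] thisDomain = {"cat", "dog"};
        String[] otherDomain = {"bird", "dog"};
        String[] unified = unify(thisDomain, otherDomain);
        System.out.println(Arrays.toString(unified));                          // [bird, cat, dog]
        System.out.println(Arrays.toString(domainMap(otherDomain, unified)));  // [0, 2]
    }
}
```

In this sketch, a stored integer value k in the other vector would be read as map[k] in the unified domain, so "dog" (index 1 in the other domain) maps to index 2.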
public long length()
public int nChunks()
public final boolean isEnum()
public java.lang.String domain(long i)
public java.lang.String[] domain()
@Deprecated public void asEnum()
Deprecated. See toEnum(); the caller is ALWAYS responsible for the deletion of the vector it returns!
public Vec toEnum()
Transform this vector to enum. The transformation is done by a TransfVec, which provides a mapping between values.
The caller is responsible for vector deletion!
@Deprecated public java.lang.String[] defaultLevels()
protected boolean readable()
protected boolean writable()
public double min()
public double max()
public double mean()
public double sigma()
public long naCnt()
public boolean isInt()
public long byteSize()
public Vec rollupStats()
public void rollupStats(Futures fs)
public void rollupStats(H2O.H2OCountedCompleter cc)
public void postWrite()
public long chunk2StartElem(int cidx)
public int chunkLen(int cidx)
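The element-to-chunk mapping behind nChunks(), chunk2StartElem() and chunkLen() can be pictured with a small standalone model. The sketch below assumes, purely for illustration, that the start row of every chunk is kept in a sorted long[] whose last entry is the total row count, and that no chunk is empty. The class `ChunkLayoutDemo`, the field `espc` and the method `elem2ChunkIdx` are hypothetical; this page does not describe how Vec stores its layout.

```java
import java.util.Arrays;

// Hypothetical model of an element-to-chunk mapping; not the H2O implementation.
final class ChunkLayoutDemo {
    // Assumption: espc[i] is the first row of chunk i; the last entry is the total row count.
    private final long[] espc;

    ChunkLayoutDemo(long[] espc) { this.espc = espc; }

    int nChunks() { return espc.length - 1; }

    long length() { return espc[espc.length - 1]; }

    long chunk2StartElem(int cidx) { return espc[cidx]; }

    int chunkLen(int cidx) { return (int) (espc[cidx + 1] - espc[cidx]); }

    /** Chunk index holding the given row, found by binary search over the start rows. */
    int elem2ChunkIdx(long row) {
        if (row < 0 || row >= length()) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        int idx = Arrays.binarySearch(espc, row);
        // Exact hit: row is the first row of chunk idx.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the chunk is insertionPoint - 1.
        return idx >= 0 ? idx : -idx - 2;
    }

    public static void main(String[] args) {
        ChunkLayoutDemo layout = new ChunkLayoutDemo(new long[]{0, 4, 10, 13});
        System.out.println(layout.nChunks());          // 3
        System.out.println(layout.chunk2StartElem(1)); // 4
        System.out.println(layout.chunkLen(1));        // 6
        System.out.println(layout.elem2ChunkIdx(9));   // 1
    }
}
```

Under these assumptions a row lookup costs one binary search to find the chunk, after which the offset inside the chunk is the row minus chunk2StartElem(cidx).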
public Key chunkKey(int cidx)
public Value chunkIdx(int cidx)
Get a Chunk's Value by index, using DKV.get. Warning: this pulls the data locally; using this call on every Chunk index on the same node will probably trigger an OOM!
public final Vec.VectorGroup group()
public Chunk elem2BV(int cidx)
public final Chunk chunk(long i)
public final long at8(long i)
public final double at(long i)
public final boolean isNA(long row)
public final long set(long i, long l)
public final double set(long i, double d)
public final float set(long i, float f)
public final boolean setNA(long i)
public java.lang.String toString()
Overrides: toString in class java.lang.Object
public void remove(Futures fs)