water.fvec
Class Chunk

java.lang.Object
  extended by water.Iced
      extended by water.fvec.Chunk
All Implemented Interfaces:
java.lang.Cloneable, Freezable
Direct Known Subclasses:
C0DChunk, C0LChunk, C1Chunk, C1NChunk, C1SChunk, C2Chunk, C2SChunk, C4Chunk, C4FChunk, C4SChunk, C8Chunk, C8DChunk, CBSChunk, NewChunk

public abstract class Chunk
extends Iced
implements java.lang.Cloneable

A compression scheme over a chunk, which is a single array of bytes. The actual vector header info is in the Vec struct, which contains the info needed to find all the bytes of the distributed vector. This struct is basically a 1-entry chunk cache of the total vector. Subclasses of this abstract class implement (possibly empty) compression schemes.


Field Summary
protected  Chunk _chk
           
 int _len
           
 long _start
           
 Vec _vec
           
 
Method Summary
 double at_slow(long i)
          Slightly slower than 'at0' inside a chunk; goes (very) slow outside the chunk instead of throwing.
 double at(long i)
          Load a double value.
 double at0(int i)
          The zero-based API.
protected abstract  long at8_impl(int idx)
           
 long at8_slow(long i)
           
 long at8(long i)
          Load a long value.
 long at80(int i)
           
protected abstract  double atd_impl(int idx)
          Chunk-specific readers.
 long byteSize()
           
 int cidx()
           
 Chunk clone()
           
 void close(int cidx, Futures fs)
          After writing, close() must be called to register the bulk changes.
 byte[] getBytes()
           
protected abstract  boolean isNA_impl(int idx)
           
 boolean isNA_slow(long i)
           
 boolean isNA(long i)
          Fetch the missing-status the slow way.
 boolean isNA0(int i)
           
 int pformat_len()
           
protected  int pformat_len0()
           
protected  int pformat_len0(double scale, int lg)
           
 java.lang.String pformat()
           
protected  java.lang.String pformat0()
           
abstract  Chunk read(AutoBuffer bb)
          Deserialize from the AutoBuffer into a pre-existing 'this' object.
 boolean readable()
           
 double set(long i, double d)
          Write element the slow way, as a double.
 float set(long i, float f)
          Write element the slow way, as a float.
 long set(long i, long l)
          Write element the slow way, as a long.
 double set0(int idx, double d)
          Set a double element in a chunk given a 0-based chunk local index.
 float set0(int idx, float f)
          Set a float element in a chunk given a 0-based chunk local index.
 long set0(int idx, long l)
          Set a long element in a chunk given a 0-based chunk local index.
 boolean setNA(long i)
          Set the element as missing the slow way.
 boolean setNA0(int idx)
          Set the element in a chunk as missing given a 0-based chunk local index.
 java.lang.String toString()
           
 boolean writable()
           
abstract  AutoBuffer write(AutoBuffer bb)
          Chunk-specific implementations of read & write
 
Methods inherited from class water.Iced
frozenType, init, newInstance, toDocField, writeJSON, writeJSONFields
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

_start

public long _start

_len

public int _len

_chk

protected Chunk _chk

_vec

public Vec _vec

Method Detail

readable

public final boolean readable()

writable

public final boolean writable()

getBytes

public final byte[] getBytes()

at8

public final long at8(long i)
Load a long value. Floating point values are silently rounded to an integer. Throws if the value is missing.

Loads from the 1-entry chunk cache, or misses-out. This version uses absolute element numbers, but must convert them to chunk-relative indices - requiring a load from an aliasing local var, leading to lower quality JIT'd code (similar issue to using iterator objects).

Slightly slower than 'at80' since it range-checks within a chunk.
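
The conversion from an absolute element number to a chunk-relative index described above can be sketched with a toy class. This is only an illustration under stated assumptions: `ToyChunk`, its `start` and `vals` fields, and the exception type are hypothetical stand-ins, the payload is left uncompressed, and missing values are not modeled. It is not the library's implementation.

```java
// Toy model only: ToyChunk and its fields are hypothetical, not H2O's Chunk.
class ToyChunk {
    final long start;   // absolute row number of the first element (plays the role of _start)
    final long[] vals;  // uncompressed payload; real chunks hold compressed bytes

    ToyChunk(long start, long[] vals) {
        this.start = start;
        this.vals = vals;
    }

    // Chunk-relative read: the caller supplies an index in [0, vals.length).
    long at80(int i) {
        return vals[i];
    }

    // Absolute read: convert to a chunk-relative index, range-check, then delegate.
    long at8(long i) {
        long x = i - start;
        if (x < 0 || x >= vals.length)
            throw new ArrayIndexOutOfBoundsException("row " + i + " is outside this chunk");
        return at80((int) x);
    }
}
```

The extra subtraction and range check on every call are the overhead the description refers to; the chunk-relative method skips both.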


at

public final double at(long i)
Load a double value. Returns Double.NaN if value is missing.

Loads from the 1-entry chunk cache, or misses-out. This version uses absolute element numbers, but must convert them to chunk-relative indices - requiring a load from an aliasing local var, leading to lower quality JIT'd code (similar issue to using iterator objects).

Slightly slower than 'at0' since it range-checks within a chunk.


isNA

public final boolean isNA(long i)
Fetch the missing-status the slow way.


at0

public final double at0(int i)
The zero-based API. Somewhere between 10% and 30% faster in a tight loop over the data than the generic at() API. Probably no gain on larger loops. The row reference is zero-based within the chunk, and should be range-checked by the JIT as expected.
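
The two loop shapes being compared can be sketched as follows. `ToyDoubleChunk` and its members are hypothetical stand-ins with an uncompressed payload; the sketch only shows that both loops visit the same elements and makes no timing claim.

```java
// Toy model only: ToyDoubleChunk is hypothetical, not H2O's Chunk.
class ToyDoubleChunk {
    final long start;     // absolute row number of the first element
    final double[] vals;  // uncompressed payload

    ToyDoubleChunk(long start, double[] vals) {
        this.start = start;
        this.vals = vals;
    }

    // Chunk-relative read.
    double at0(int i) {
        return vals[i];
    }

    // Absolute read: subtract the start row and range-check before delegating.
    double at(long i) {
        long x = i - start;
        if (x < 0 || x >= vals.length)
            throw new ArrayIndexOutOfBoundsException("row " + i);
        return at0((int) x);
    }

    // Tight loop over chunk-local indices 0 .. length-1.
    static double sumLocal(ToyDoubleChunk c) {
        double sum = 0;
        for (int i = 0; i < c.vals.length; i++) sum += c.at0(i);
        return sum;
    }

    // Same traversal using absolute row numbers.
    static double sumAbsolute(ToyDoubleChunk c) {
        double sum = 0;
        for (long i = c.start; i < c.start + c.vals.length; i++) sum += c.at(i);
        return sum;
    }
}
```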


at80

public final long at80(int i)

isNA0

public final boolean isNA0(int i)

at_slow

public final double at_slow(long i)
Slightly slower than 'at0' inside a chunk; goes (very) slow outside the chunk instead of throwing. The first outside-chunk access fetches and caches the whole chunk, which may take multiple milliseconds. Second and later touches in the same outside chunk probably run 100x slower than inside-chunk accesses.
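
The fetch-once-then-reuse behavior described above can be modeled with a one-entry cache. This is a toy sketch: `ToyVec`, `ToyReader`, `chunkFor`, and the `fetches` counter are all hypothetical names, and the "fetch" is a simple in-memory lookup standing in for whatever the library actually does.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: ToyVec and ToyReader are hypothetical, not H2O classes.
class ToyVec {
    final List<double[]> chunks = new ArrayList<>();  // chunk payloads in row order
    final List<Long> starts = new ArrayList<>();      // absolute start row of each chunk
    int fetches = 0;                                  // counts simulated slow fetches

    void add(double[] data) {
        long next = starts.isEmpty()
                ? 0
                : starts.get(starts.size() - 1) + chunks.get(chunks.size() - 1).length;
        starts.add(next);
        chunks.add(data);
    }

    // Simulated slow lookup of the chunk that holds absolute row i.
    int chunkFor(long i) {
        fetches++;
        for (int c = 0; c < chunks.size(); c++)
            if (i >= starts.get(c) && i < starts.get(c) + chunks.get(c).length) return c;
        throw new ArrayIndexOutOfBoundsException("row " + i);
    }
}

class ToyReader {
    final ToyVec vec;
    final int home;   // index of this reader's own chunk
    int cached = -1;  // one-entry cache: index of the last outside chunk touched

    ToyReader(ToyVec vec, int home) {
        this.vec = vec;
        this.home = home;
    }

    double atSlow(long i) {
        long x = i - vec.starts.get(home);
        if (x >= 0 && x < vec.chunks.get(home).length)  // fast path: inside the home chunk
            return vec.chunks.get(home)[(int) x];
        // Slow path: reuse the cached outside chunk if it covers row i, else fetch it.
        if (cached < 0 || i < vec.starts.get(cached)
                || i >= vec.starts.get(cached) + vec.chunks.get(cached).length)
            cached = vec.chunkFor(i);
        return vec.chunks.get(cached)[(int) (i - vec.starts.get(cached))];
    }
}
```

In this model a second read from the same outside chunk skips the fetch but still pays for the extra checks, which mirrors the "slower than inside-chunk, but no refetch" behavior in the description.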


at8_slow

public final long at8_slow(long i)

isNA_slow

public final boolean isNA_slow(long i)

set

public final long set(long i,
                      long l)
Write element the slow way, as a long. There is no way to write a missing value with this call. Under rare circumstances this can throw: if the long does not fit in a double (its magnitude is larger than 2^52) and the Vec stores float values. In that case there is no common compatible data representation.
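
One way to detect a long that does not fit in a double is a round-trip cast, sketched below. `LongFit.fitsInDouble` is a hypothetical helper written for illustration; the test and the exact threshold the library applies are not shown in this page and may differ.

```java
// Hypothetical helper, not part of the library.
class LongFit {
    // True when l converts to double and back without changing value.
    static boolean fitsInDouble(long l) {
        double d = (double) l;
        // Guard against 2^63: casting it back to long saturates at Long.MAX_VALUE,
        // which would otherwise make Long.MAX_VALUE look like an exact round trip.
        return d < 0x1p63 && (long) d == l;
    }
}
```

For example, small values and exact powers of two pass this check, while a value such as `(1L << 53) + 1` does not, because it has no exact double representation.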


set

public final double set(long i,
                        double d)
Write element the slow way, as a double. Double.NaN will be treated as a set of a missing element.


set

public final float set(long i,
                       float f)
Write element the slow way, as a float. Float.NaN will be treated as a set of a missing element.


setNA

public final boolean setNA(long i)
Set the element as missing the slow way.


set0

public final long set0(int idx,
                       long l)
Set a long element in a chunk given a 0-based chunk-local index. May rewrite or replace the chunk if it needs to be "inflated" to hold larger values. Returns the input value. Note that idx is an int (instead of a long), which signals that index 0 is the first row in the chunk, not in the whole Vec.
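
The idea of "inflating" storage on write can be illustrated with a toy chunk that starts with one byte per element and switches to a wider array the first time a value does not fit. `ToyInflatingChunk` and its fields are hypothetical; the library's real compression schemes and replacement mechanism are not reproduced here.

```java
// Toy model only: illustrates inflation, not H2O's compression schemes.
class ToyInflatingChunk {
    private byte[] small;  // compact storage, used while every value fits in a byte
    private long[] wide;   // inflated storage, allocated on the first oversized write

    ToyInflatingChunk(int len) {
        small = new byte[len];
    }

    boolean inflated() {
        return wide != null;
    }

    long at80(int idx) {
        return wide != null ? wide[idx] : small[idx];
    }

    // Writes l at chunk-local index idx and returns the input value.
    long set0(int idx, long l) {
        if (wide == null && (l < Byte.MIN_VALUE || l > Byte.MAX_VALUE)) {
            // Value does not fit: copy existing elements into wider storage.
            wide = new long[small.length];
            for (int i = 0; i < small.length; i++) wide[i] = small[i];
            small = null;
        }
        if (wide != null) wide[idx] = l;
        else small[idx] = (byte) l;
        return l;
    }
}
```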


set0

public final double set0(int idx,
                         double d)
Set a double element in a chunk given a 0-based chunk local index.


set0

public final float set0(int idx,
                        float f)
Set a float element in a chunk given a 0-based chunk local index.


setNA0

public final boolean setNA0(int idx)
Set the element in a chunk as missing given a 0-based chunk local index.


close

public void close(int cidx,
                  Futures fs)
After writing, close() must be called to register the bulk changes.


cidx

public int cidx()

atd_impl

protected abstract double atd_impl(int idx)
Chunk-specific readers.


at8_impl

protected abstract long at8_impl(int idx)

isNA_impl

protected abstract boolean isNA_impl(int idx)

write

public abstract AutoBuffer write(AutoBuffer bb)
Chunk-specific implementations of read & write

Specified by:
write in interface Freezable
Overrides:
write in class Iced

read

public abstract Chunk read(AutoBuffer bb)
Description copied from interface: Freezable
Deserialize from the AutoBuffer into a pre-existing 'this' object.

Specified by:
read in interface Freezable
Overrides:
read in class Iced

pformat

public java.lang.String pformat()

pformat_len

public int pformat_len()

pformat0

protected java.lang.String pformat0()

pformat_len0

protected int pformat_len0()

pformat_len0

protected int pformat_len0(double scale,
                           int lg)

clone

public Chunk clone()
Overrides:
clone in class Iced

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

byteSize

public long byteSize()