pyarrow.SparseCSFTensor#

class pyarrow.SparseCSFTensor#

Bases: _Weakrefable

A sparse CSF tensor.

CSF is a generalization of compressed sparse row (CSR) index.

The CSF index recursively compresses each dimension of a tensor into a set of prefix trees. Each path from a root to a leaf forms the index of one non-zero tensor element. CSF is implemented with two arrays of buffers and one array of integers.
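
To make the prefix-tree description concrete, the following is a minimal pure-Python sketch of how a sorted list of non-zero coordinates could be compressed into CSF-style `indptr` and `indices` lists. The helper `build_csf` is hypothetical and written only for illustration; it is not pyarrow's implementation, and it assumes the coordinates are unique and lexicographically sorted.

```python
def build_csf(coords):
    """Compress sorted, unique coordinate tuples into CSF-style lists.

    Hypothetical helper for illustration; not part of pyarrow.
    Returns (indptr, indices): indices[d] holds the node values at tree
    level d, and indptr[d][i]:indptr[d][i + 1] is the range of children
    in indices[d + 1] that belong to node indices[d][i].
    """
    ndim = len(coords[0])
    indices = [[] for _ in range(ndim)]
    indptr = [[] for _ in range(ndim - 1)]
    prev = None
    for coord in coords:
        # First level at which this path leaves the previous path.
        if prev is None:
            start = 0
        else:
            start = next(d for d in range(ndim) if coord[d] != prev[d])
        # Every level from there down gets a new node.
        for d in range(start, ndim):
            if d < ndim - 1:
                # Record where this node's children will begin.
                indptr[d].append(len(indices[d + 1]))
            indices[d].append(coord[d])
        prev = coord
    # Close the last child range at each level.
    for d in range(ndim - 1):
        indptr[d].append(len(indices[d + 1]))
    return indptr, indices


# Eight non-zero positions of a 2 x 3 x 4 x 5 tensor.
coords = [
    (0, 0, 0, 1), (0, 0, 0, 2), (0, 1, 0, 0), (0, 1, 0, 2),
    (0, 1, 1, 0), (1, 1, 1, 0), (1, 1, 1, 1), (1, 1, 1, 2),
]
indptr, indices = build_csf(coords)
print(indptr)   # [[0, 2, 3], [0, 1, 3, 4], [0, 2, 4, 5, 8]]
print(indices)  # [[0, 1], [0, 1, 1], [0, 0, 1, 1], [1, 2, 0, 2, 0, 0, 1, 2]]
```

Shared prefixes are stored once: the two roots in `indices[0]` cover all eight leaves in `indices[3]`.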

Examples

>>> import pyarrow as pa
>>> import numpy as np
>>> # Create a 3D sparse tensor
>>> dense_tensor = np.zeros((2, 3, 2), dtype=np.float32)
>>> dense_tensor[0, 1, 0] = 1.0
>>> dense_tensor[1, 2, 1] = 2.0
>>> sparse_csf = pa.SparseCSFTensor.from_dense_numpy(dense_tensor)
>>> sparse_csf
<pyarrow.SparseCSFTensor>
type: float
shape: (2, 3, 2)

__init__(*args, **kwargs)#

Methods

`__init__`(*args, **kwargs)
`dim_name`(self, i)	Returns the name of the i-th tensor dimension.
`equals`(self, SparseCSFTensor other)	Return true if the sparse tensors contain exactly equal data
`from_dense_numpy`(cls, obj[, dim_names])	Convert numpy.ndarray to arrow::SparseCSFTensor
`from_numpy`(data, indptr, indices, shape[, ...])	Create arrow::SparseCSFTensor from numpy.ndarrays
`from_tensor`(obj)	Convert arrow::Tensor to arrow::SparseCSFTensor
`to_numpy`(self)	Convert arrow::SparseCSFTensor to numpy.ndarrays with zero copy
`to_tensor`(self)	Convert arrow::SparseCSFTensor to arrow::Tensor

Attributes

`dim_names`
`is_mutable`
`ndim`
`non_zero_length`
`shape`
`size`
`type`

dim_name(self, i)#

Returns the name of the i-th tensor dimension.

Parameters:

i (int): The physical index of the tensor dimension.

Returns:

str

dim_names#

equals(self, SparseCSFTensor other)#

Return true if the sparse tensors contain exactly equal data

Parameters:

other (SparseCSFTensor): The other tensor to compare for equality.

classmethod from_dense_numpy(cls, obj, dim_names=None)#

Convert numpy.ndarray to arrow::SparseCSFTensor

Parameters:

obj (numpy.ndarray): Dense data used to populate the sparse tensor.
dim_names (list[str], optional): Names of the dimensions.

Returns:

pyarrow.SparseCSFTensor

static from_numpy(data, indptr, indices, shape, axis_order=None, dim_names=None)#

Create arrow::SparseCSFTensor from numpy.ndarrays

Parameters:

data (numpy.ndarray): Data used to populate the sparse tensor.
indptr (list of numpy.ndarray): The sparsity structure. Each pair of consecutive dimensions in the tensor corresponds to a buffer in indptr. The consecutive values indptr[dim][i] and indptr[dim][i + 1] delimit the range of nodes in indices[dim + 1] that are children of the node indices[dim][i].
indices (list of numpy.ndarray): Stores the values of the nodes. Each tensor dimension corresponds to a buffer in indices.
shape (tuple): Shape of the tensor.
axis_order (list, optional): The sequence in which dimensions were traversed to produce the prefix tree.
dim_names (list, optional): Names of the dimensions.

static from_tensor(obj)#

Convert arrow::Tensor to arrow::SparseCSFTensor

Parameters:

obj (Tensor): The dense tensor that should be converted.

is_mutable#

ndim#

non_zero_length#

shape#

size#

to_numpy(self)#

Convert arrow::SparseCSFTensor to numpy.ndarrays with zero copy

to_tensor(self)#

Convert arrow::SparseCSFTensor to arrow::Tensor

type#