Read a matrix from, or write a matrix to, an AnnData hdf5 file. These functions will
automatically transpose matrices when converting to/from the AnnData
format. This is because the AnnData convention stores cells as rows, whereas the R
convention stores cells as columns. If this behavior is undesired, call t()
manually on the matrix inputs and outputs of these functions.
Most users writing to AnnData files should default to write_matrix_anndata_hdf5() rather
than the dense variant (see details for more information).
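The round trip described above can be sketched as follows. This is a minimal illustration, not an official example: it assumes the BPCells and Matrix packages are installed, the toy matrix and the temporary file path are made up for the demonstration, and the BPCells calls are skipped when the package is unavailable.

```r
library(Matrix)

# Toy genes x cells matrix in the R convention (cells as columns).
# It is deliberately non-square so that obs and var lengths differ.
mat <- sparseMatrix(
  i = c(1, 3, 5, 2),
  j = c(1, 1, 2, 3),
  x = c(4, 1, 2, 7),
  dims = c(5, 3),
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:3))
)

if (requireNamespace("BPCells", quietly = TRUE)) {
  library(BPCells)
  path <- tempfile(fileext = ".h5ad")  # scratch file, chosen for this sketch

  # The writer transposes, so the file holds cells as rows (AnnData convention).
  write_matrix_anndata_hdf5(as(mat, "IterableMatrix"), path)

  # The reader transposes back, so R again sees genes as rows.
  mat_back <- open_matrix_anndata_hdf5(path)
  print(dim(mat_back))
}
```

To keep the on-disk orientation instead, apply t() to the input before writing and to the output after reading.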
Usage
open_matrix_anndata_hdf5(path, group = "X", buffer_size = 16384L)
write_matrix_anndata_hdf5(
mat,
path,
group = "X",
buffer_size = 16384L,
chunk_size = 1024L,
gzip_level = 0L
)
write_matrix_anndata_hdf5_dense(
mat,
path,
dataset = "X",
buffer_size = 16384L,
chunk_size = 1024L,
gzip_level = 0L
)
Arguments
- mat
The matrix to write to the hdf5 file
- path
Path to the hdf5 file on disk
- group
The group within the hdf5 file to read the data from or write the data to. If writing to an existing hdf5 file, this group must not already be in use
- buffer_size
For performance tuning only. The number of items to be buffered in memory before calling writes to disk.
- chunk_size
For performance tuning only. The chunk size used for the HDF5 array storage.
- gzip_level
Gzip compression level. Default is 0 (no compression)
- dataset
The dataset within the hdf5 file to write the matrix to. Used for write_matrix_anndata_hdf5_dense
Details
Efficiency considerations: Reading from a dense AnnData matrix will generally be slower than sparse for single cell datasets, so it is recommended to re-write any dense AnnData inputs to a sparse format early in processing.
write_matrix_anndata_hdf5() should be used by default, as it always writes in the more efficient sparse format.
write_matrix_anndata_hdf5_dense() writes in the AnnData dense format, and can be used for smaller matrices
when efficiency and file size are less of a concern than increased portability (e.g. writing to obsm or varm matrices).
See the AnnData docs for format details.
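The recommendation to re-write dense inputs as sparse early in processing could look like the sketch below. It assumes BPCells and Matrix are installed and that the installed BPCells version provides write_matrix_anndata_hdf5_dense(). The dense file is created here only as a stand-in for a dense AnnData input received from elsewhere, and both file paths are temporary placeholders.

```r
library(Matrix)

if (requireNamespace("BPCells", quietly = TRUE)) {
  library(BPCells)
  dense_path <- tempfile(fileext = ".h5ad")   # stand-in for an existing dense input
  sparse_path <- tempfile(fileext = ".h5ad")  # destination for the sparse copy

  # Create a small dense-format file so the sketch has something to read.
  toy <- sparseMatrix(
    i = c(1, 2, 4), j = c(1, 3, 2), x = c(3, 1, 5),
    dims = c(4, 3),
    dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
  )
  write_matrix_anndata_hdf5_dense(as(toy, "IterableMatrix"), dense_path)

  # Open the dense input once, then immediately re-write it in sparse format.
  dense_in <- open_matrix_anndata_hdf5(dense_path)
  write_matrix_anndata_hdf5(dense_in, sparse_path)

  # Downstream steps read from the sparse copy instead of the dense original.
  sparse_in <- open_matrix_anndata_hdf5(sparse_path)
  print(dim(sparse_in))
}
```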
Dimension names: Dimnames are inferred from obs/_index or var/_index based on length matching.
This helps to infer dimnames for obsp, varm, etc. If len(obs) == len(var),
dimname inference will be disabled, since matching by length is then ambiguous.
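The length-matching rule can be illustrated in base R. The helper infer_dimnames() below is hypothetical and written only for this illustration. It is not part of BPCells and does not claim to mirror the package's internal implementation.

```r
# Hypothetical helper (not a BPCells function): pick names for each dimension
# of a stored array by comparing its length against obs and var.
infer_dimnames <- function(dims, obs_names, var_names) {
  # When obs and var have equal length, a length match cannot tell them
  # apart, so no names are inferred for any dimension.
  if (length(obs_names) == length(var_names)) {
    return(vector("list", length(dims)))
  }
  lapply(dims, function(n) {
    if (n == length(obs_names)) {
      obs_names
    } else if (n == length(var_names)) {
      var_names
    } else {
      NULL  # dimension matches neither index, so it stays unnamed
    }
  })
}

obs <- c("cell1", "cell2", "cell3")
var <- c("geneA", "geneB", "geneC", "geneD", "geneE")

# obsp-like array (cells x cells): both dimensions match obs.
obsp_names <- infer_dimnames(c(3, 3), obs, var)

# varm-like array (genes x 2 components): only the first dimension matches.
varm_names <- infer_dimnames(c(5, 2), obs, var)

# Equal obs and var lengths: inference is disabled.
square_names <- infer_dimnames(c(3, 3), obs, var[1:3])
```

In this sketch a dimension that matches neither index is simply left unnamed, which is one plausible reading of the rule rather than documented behavior.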