
Read a matrix from, or write a matrix to, an AnnData HDF5 file. These functions automatically transpose matrices when converting to or from the AnnData format, because the AnnData convention stores cells as rows, whereas the R convention stores cells as columns. If this behavior is undesired, call t() manually on the matrix inputs and outputs of these functions.

Most users writing to AnnData files should default to write_matrix_anndata_hdf5() rather than the dense variant (see details for more information).
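
The transposition described above can be sketched with a small round trip. This is an illustration only: it assumes the functions are loaded from the BPCells package and that the Matrix package is installed, and the matrix contents, dimnames, and temporary file path are made up for the example.

```r
library(Matrix)
library(BPCells)

# Illustrative data: 3 genes (rows) x 5 cells (columns), the usual R orientation.
mat <- sparseMatrix(
  i = c(1, 3, 2, 1, 3),
  j = c(1, 1, 2, 4, 5),
  x = c(2, 1, 4, 3, 5),
  dims = c(3, 5),
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:5))
)

# Hypothetical output location for this example.
path <- tempfile(fileext = ".h5ad")

# Convert to a BPCells matrix, then write. On disk, AnnData keeps cells as rows.
write_matrix_anndata_hdf5(as(mat, "IterableMatrix"), path, group = "X")

# Reading transposes back, so the R object has cells as columns again.
roundtrip <- open_matrix_anndata_hdf5(path, group = "X")
dim(roundtrip)
```

If the cells-as-rows orientation is wanted in R, wrapping the result in t() gives the matrix as it is laid out in the file.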

Usage

open_matrix_anndata_hdf5(path, group = "X", buffer_size = 16384L)

write_matrix_anndata_hdf5(
  mat,
  path,
  group = "X",
  buffer_size = 16384L,
  chunk_size = 1024L,
  gzip_level = 0L
)

write_matrix_anndata_hdf5_dense(
  mat,
  path,
  dataset = "X",
  buffer_size = 16384L,
  chunk_size = 1024L,
  gzip_level = 0L
)

Arguments

path

Path to the hdf5 file on disk

group

The group within the hdf5 file to read the matrix from or write it to. When writing to an existing hdf5 file, this group must not already be in use.

buffer_size

For performance tuning only. The number of items to be buffered in memory before calling writes to disk.

chunk_size

For performance tuning only. The chunk size used for the HDF5 array storage.

gzip_level

Gzip compression level. Default is 0 (no compression)

dataset

The dataset within the hdf5 file to write the matrix to. Used only by write_matrix_anndata_hdf5_dense().

Value

AnnDataMatrixH5 object, with cells as the columns.

Details

Efficiency considerations: Reading from a dense AnnData matrix will generally be slower than reading from a sparse one for single-cell datasets, so it is recommended to re-write any dense AnnData inputs to a sparse format early in processing.

write_matrix_anndata_hdf5() should be used by default, as it always writes in the more efficient sparse format. write_matrix_anndata_hdf5_dense() writes in the AnnData dense format, and can be used for smaller matrices when efficiency and file size are less of a concern than increased portability (e.g. writing to obsm or varm matrices). See the AnnData docs for format details.
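
As a rough sketch of the two writers side by side, the example below writes the same small matrix in both formats and opens each file with the same reader. It assumes the BPCells and Matrix packages are installed. The matrix, dimnames, and file paths are invented for illustration, and no claim is made here about the resulting file sizes.

```r
library(Matrix)
library(BPCells)

# Illustrative small matrix: 4 features x 6 cells.
mat <- as(
  sparseMatrix(
    i = c(1, 2, 4, 3),
    j = c(1, 3, 4, 6),
    x = c(1, 2, 3, 4),
    dims = c(4, 6),
    dimnames = list(paste0("f", 1:4), paste0("cell", 1:6))
  ),
  "IterableMatrix"
)

# Hypothetical output locations for this example.
sparse_path <- tempfile(fileext = ".h5ad")
dense_path <- tempfile(fileext = ".h5ad")

# Sparse writer takes `group`; the dense writer takes `dataset` instead.
write_matrix_anndata_hdf5(mat, sparse_path, group = "X")
write_matrix_anndata_hdf5_dense(mat, dense_path, dataset = "X")

# Both files are opened through the same function.
sparse_mat <- open_matrix_anndata_hdf5(sparse_path, group = "X")
dense_mat <- open_matrix_anndata_hdf5(dense_path, group = "X")
```

Note the argument naming difference: the sparse format is stored as an HDF5 group, while the dense format is a single HDF5 dataset.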

Dimension names: Dimnames are inferred from obs/_index or var/_index by matching their lengths against the matrix dimensions. This helps to infer dimnames for obsp, varm, etc. If len(obs) == len(var), the match is ambiguous, so dimname inference is disabled.
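
The length-matching rule can be illustrated in plain R. This is a sketch of the rule as described here, not the package's internal code; infer_axis_names() and the example index vectors are hypothetical.

```r
# Hypothetical helper: choose names for one matrix axis of length `n`
# by comparing it with the lengths of obs/_index and var/_index.
infer_axis_names <- function(n, obs_index, var_index) {
  n_obs <- length(obs_index)
  n_var <- length(var_index)
  # Ambiguous case: equal lengths cannot tell obs from var, so infer nothing.
  if (n_obs == n_var) return(NULL)
  if (n == n_obs) return(obs_index)
  if (n == n_var) return(var_index)
  # Axis matches neither index (e.g. the component axis of a varm matrix).
  NULL
}

obs_index <- c("cell1", "cell2", "cell3", "cell4")
var_index <- c("geneA", "geneB", "geneC")

# An obsp-style matrix is cells x cells, so each axis matches obs/_index.
obsp_names <- infer_axis_names(4L, obs_index, var_index)

# A varm-style matrix has one axis of length len(var) and one unrelated axis.
varm_names <- infer_axis_names(3L, obs_index, var_index)
other_names <- infer_axis_names(2L, obs_index, var_index)

# With len(obs) == len(var), inference is skipped entirely.
ambiguous <- infer_axis_names(3L, c("c1", "c2", "c3"), var_index)
```

Under this rule, a dataset with equal numbers of cells and genes returns no inferred names.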