Read a sparse matrix from a MatrixMarket file. This is a text-based format used by 10x, Parse, and others to store sparse matrices. Format details are available on the NIST website.

Usage

import_matrix_market(
  mtx_path,
  outdir = tempfile("matrix_market"),
  row_names = NULL,
  col_names = NULL,
  row_major = FALSE,
  tmpdir = tempdir(),
  load_bytes = 4194304L,
  sort_bytes = 1073741824L
)

import_matrix_market_10x(
  mtx_dir,
  outdir = tempfile("matrix_market"),
  feature_type = NULL,
  row_major = FALSE,
  tmpdir = tempdir(),
  load_bytes = 4194304L,
  sort_bytes = 1073741824L
)

Arguments

mtx_path

Path to an mtx or mtx.gz file

outdir

Directory to store the output

row_names

Character vector of row names

col_names

Character vector of column names

row_major

If TRUE, store the matrix in row-major orientation

tmpdir

Temporary directory to use for intermediate storage

load_bytes

The minimum contiguous load size during the merge sort passes

sort_bytes

The amount of memory to allocate for re-sorting chunks of entries

mtx_dir

Directory holding matrix.mtx.gz, barcodes.tsv.gz, and features.tsv.gz

feature_type

String or vector of feature types to include (cellranger 3.0 and newer)

Value

MatrixDir object with the imported matrix

Details

Import MatrixMarket mtx files to the BPCells format. This implementation keeps memory usage fixed even for very large inputs by sorting entries on disk. Importing is much slower than reading hdf5 inputs, so only use the MatrixMarket format when absolutely necessary.
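
As a rough illustration of a basic import, the sketch below writes a tiny sparse matrix to an mtx file with Matrix::writeMM() and reads it back with import_matrix_market(). The toy matrix, the "gene"/"cell" names, and the temporary file path are made up for the example, and it assumes the Matrix and BPCells packages are installed. Since the mtx format stores no row or column names, they are passed in explicitly.

```r
library(Matrix)
library(BPCells)

# Toy 4 x 3 sparse matrix (illustrative values only)
m <- sparseMatrix(
  i = c(1, 3, 2, 4),
  j = c(1, 1, 2, 3),
  x = c(5, 2, 7, 1),
  dims = c(4, 3)
)

# Write it as a MatrixMarket text file and show the raw contents
mtx_path <- tempfile(fileext = ".mtx")
writeMM(m, mtx_path)
print(readLines(mtx_path))

# Import into the BPCells on-disk format, supplying names by hand
bp <- import_matrix_market(
  mtx_path,
  row_names = paste0("gene", 1:4),
  col_names = paste0("cell", 1:3)
)

# Convert back to an in-memory sparse matrix to check the round trip
m2 <- as(bp, "dgCMatrix")
```

For small inputs like this the defaults for tmpdir, load_bytes, and sort_bytes should be fine; they matter mainly for inputs too large to sort in memory.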

As a rough speed estimate, importing the 17GB Parse 1M PBMC DGE_1M_PBMC.mtx file takes about 4 minutes and 1.3GB of RAM, producing a compressed output matrix of 1.5GB. mtx.gz files will be slower to import due to gzip decompression.

When importing from 10x mtx files, the row and column names can be read automatically using the import_matrix_market_10x() convenience function.
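
A minimal sketch of the 10x convenience function is shown below. The wrapper name import_10x_gene_expression and its argument are hypothetical, chosen only for this example; the directory passed in is assumed to contain matrix.mtx.gz, barcodes.tsv.gz, and features.tsv.gz as described under mtx_dir. "Gene Expression" is used as an example feature_type value and may not match every dataset.

```r
library(BPCells)

# Hypothetical helper: import only the gene expression features
# from a 10x-style directory (cellranger 3.0 and newer layout).
import_10x_gene_expression <- function(mtx_dir) {
  stopifnot(dir.exists(mtx_dir))
  import_matrix_market_10x(
    mtx_dir,
    feature_type = "Gene Expression"
  )
}
```

It could then be called as import_10x_gene_expression("path/to/filtered_feature_bc_matrix"), where the path is a placeholder for a real 10x output directory.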