Normalizations and PCA
Avoid dense matrices whenever possible. Put normalizations that preserve sparsity (0 values stay 0) before normalizations that break sparsity (e.g. adding values to each row/column). A typical RNA-seq matrix has fewer than 5% nonzero entries, so your code will operate on at least 20x more entries with a dense matrix.
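As a rough illustration of the difference, the sketch below uses the Matrix package (not BPCells itself) on a hypothetical toy counts matrix with 5% nonzero entries. The matrix size, the Poisson counts, and all variable names are assumptions made for the example.

```r
library(Matrix)

set.seed(1)
# Hypothetical toy counts: 1000 genes x 200 cells, 5% of entries nonzero
counts <- rsparsematrix(1000, 200, density = 0.05,
                        rand.x = function(n) rpois(n, 3) + 1)

# Sparsity-preserving step: log1p(0) is 0, so zeros stay zero
logged <- log1p(counts)

# Sparsity-breaking step: subtracting each gene's mean turns zeros into nonzeros
centered <- as.matrix(logged) - Matrix::rowMeans(logged)

nnz_sparse <- nnzero(logged)          # entries a sparse operation touches
nnz_dense <- sum(centered != 0)       # entries that are nonzero after centering
dense_ratio <- length(centered) / nnz_sparse  # all entries vs. sparse entries
```

Here `dense_ratio` is the factor by which a dense operation has more entries to process than a sparse one. At 5% density it works out to 20, which is where the figure above comes from.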

For most operations, we recommend using lazy evaluation to avoid creating intermediate matrices. The one common exception to this rule is when running PCA. Because PCA requires looping through the matrix several hundred times, it is often faster to write the matrix to disk once just before PCA rather than recalculating the entries on each PCA iteration.
 For storage efficiency, keep any sparsity-breaking normalizations delayed, but store all the sparse normalizations in a temporary location with write_matrix_dir(), then apply the sparsity-breaking normalizations.
 Adding values to the rows/columns of a matrix has very little overhead for PCA because it translates into a pre- or post-processing step around each matrix-vector multiply. However, because it is a sparsity-breaking operation, adding a vector to the matrix makes most other operations more expensive.
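The workflow described above might look roughly like the following sketch. The toy input matrix, the variable names, and the choice of `svds()` for the PCA step are assumptions for illustration. If `svds()` is not available in your installed BPCells version, another iterative SVD solver such as `irlba::irlba()` could take its place.

```r
library(BPCells)
library(Matrix)

set.seed(1)
# Hypothetical toy counts: 500 genes x 100 cells, 5% of entries nonzero
counts <- rsparsematrix(500, 100, density = 0.05,
                        rand.x = function(n) rpois(n, 3) + 1)
mat <- as(counts, "IterableMatrix")

# Sparsity-preserving normalizations (lazy): depth-normalize, then log1p
sparse_norm <- log1p(multiply_cols(mat, 1e4 / Matrix::colSums(counts)))

# One pass for per-gene mean and variance; drop genes with zero variance
stats <- matrix_stats(sparse_norm, row_stats = "variance")
keep <- which(stats$row_stats["variance", ] > 0)

# Write the sparse result to a temporary location once, just before PCA
sparse_norm <- write_matrix_dir(sparse_norm[keep, ], tempfile("sparse_norm"))

# Sparsity-breaking normalizations stay delayed: center and scale each gene
gene_means <- stats$row_stats["mean", keep]
gene_sds <- sqrt(stats$row_stats["variance", keep])
scaled <- (sparse_norm - gene_means) / gene_sds

# PCA via truncated SVD; each iteration re-reads the on-disk sparse matrix
pca <- BPCells::svds(scaled, k = 10)
```

Only the sparse matrix is written to disk. The centering and scaling are applied on the fly during each matrix-vector multiply.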
Storage order
 Sparse matrices can be stored in a row-major or column-major
orientation with BPCells. Along the indexed dimension (rows for
row-major, columns for column-major), BPCells can efficiently seek to a
selected row or column without reading the whole matrix. This has
performance implications for certain operations:
 Marker features can only be computed on a matrix indexed by gene/feature.
 Sparse matrix multiplication can only be performed between matrices with the same storage order.
 Sparse matrix multiplication performance can change dramatically depending on the storage order and relative matrix size/sparsity. For column-major matrices, the left matrix should be fast to load and contain few delayed operations, while the right matrix can be slow to load and contain many delayed operations. For row-major matrices the left/right preferences are reversed.
 You can check the storage order for a matrix by printing it out in the R terminal.
 When calling the t() function, BPCells just flips a boolean flag for whether the matrix is row-major or column-major. This does not affect the underlying storage order.
 To adjust the underlying storage order, call transpose_storage_order(). This is a slower operation that requires writing a new copy of the data to disk.
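A minimal sketch of the difference between the two calls, assuming a small in-memory toy matrix converted from a dgCMatrix. The variable names are hypothetical, and `storage_order()` is used here as a programmatic alternative to printing the matrix.

```r
library(BPCells)
library(Matrix)

set.seed(1)
# Hypothetical toy matrix: 50 genes x 20 cells (dgCMatrix input is column-major)
counts <- rsparsematrix(50, 20, density = 0.1)
mat <- as(counts, "IterableMatrix")

# Query the storage order ("row" or "col") without printing the matrix
order_before <- storage_order(mat)

# t() is cheap: it swaps the dimensions and flips the reported order,
# but the data itself is left untouched
flipped <- t(mat)
order_flipped <- storage_order(flipped)

# transpose_storage_order() keeps the dimensions but rewrites the data
# to disk (a temporary directory by default) in the other storage order
rewritten <- transpose_storage_order(mat)
order_rewritten <- storage_order(rewritten)
```

After this, `flipped` is a 20 x 50 view of the same data, while `rewritten` is still 50 x 20 but backed by a new on-disk copy.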
Other tips
 Use a single call to matrix_stats() to calculate mean and variance in a single pass through the matrix when possible. See the function reference for details.
 For ATAC-seq data, you can calculate variable features on the tile matrix without ever saving it to disk. This allows you to subset to variable tiles and create a peak matrix with just your variable tiles for some space savings.
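For the matrix_stats() tip, a single call can return several statistics at once. The sketch below assumes a hypothetical toy counts matrix; requesting "variance" also yields the mean as a row of the result.

```r
library(BPCells)
library(Matrix)

set.seed(1)
# Hypothetical toy counts: 200 genes x 50 cells, 5% of entries nonzero
counts <- rsparsematrix(200, 50, density = 0.05,
                        rand.x = function(n) rpois(n, 3) + 1)
mat <- as(counts, "IterableMatrix")

# One pass through the matrix: per-gene mean and variance, per-cell mean
stats <- matrix_stats(mat, row_stats = "variance", col_stats = "mean")

gene_means <- stats$row_stats["mean", ]
gene_vars <- stats$row_stats["variance", ]
cell_means <- stats$col_stats["mean", ]
```

Computing the same values with separate calls would read the matrix once per statistic, which matters most when the matrix has many delayed operations or lives on slow storage.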