Transpose the storage order for a matrix

## Arguments

- matrix
Input matrix

- outdir
Directory to store the output

- tmpdir
Temporary directory to use for intermediate storage

- load_bytes
The minimum contiguous load size during the merge sort passes

- sort_bytes
The amount of memory to allocate for re-sorting chunks of entries

## Details

This re-sorts the entries of a matrix to change the storage order from row-major to column-major. For large matrices, this can be slow: around 2 minutes to transpose a 500k-cell RNA-seq matrix. The default load_bytes (4MiB) and sort_bytes (1GiB) parameters allow ~85GB of data to be sorted in two passes through the data, and ~7.3TB in three passes.
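
The capacity figures above can be reproduced with a simple external merge sort model. This is a rough sketch rather than the actual implementation. It assumes the first pass sorts chunks of sort_bytes each, and that every later pass merges as many chunks as fit in sort_bytes when each chunk needs three load_bytes-sized buffers. The factor of three (here the hypothetical `streams` parameter, for example one buffer each for row, column, and value data) is an assumption chosen because it matches the quoted numbers. The function name `sortable_bytes` is likewise illustrative.

```python
MIB = 2**20
GIB = 2**30


def sortable_bytes(load_bytes, sort_bytes, passes, streams=3):
    """Estimate how much data can be sorted in `passes` passes.

    Assumed model (not taken from the implementation):
      - pass 1 sorts chunks of `sort_bytes` each in memory;
      - each later pass merges `fan_in` sorted chunks at once, where every
        chunk being merged needs `streams` buffers of `load_bytes` each.
    """
    if passes < 1:
        raise ValueError("passes must be at least 1")
    fan_in = sort_bytes / (streams * load_bytes)
    return sort_bytes * fan_in ** (passes - 1)


# Defaults from the text: load_bytes = 4MiB, sort_bytes = 1GiB
two_pass = sortable_bytes(4 * MIB, 1 * GIB, passes=2)
three_pass = sortable_bytes(4 * MIB, 1 * GIB, passes=3)

print(f"two passes:   {two_pass / GIB:.1f} GiB")    # 85.3 GiB
print(f"three passes: {three_pass / GIB:.0f} GiB")  # 7282 GiB
```

Under this model, increasing sort_bytes raises capacity quadratically for two passes, while increasing load_bytes lowers the merge fan-in and therefore the capacity per pass.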