Skip to contents

Calculate ranges x cells tile overlap matrix

Usage

tile_matrix(
  fragments,
  ranges,
  mode = c("insertions", "fragments"),
  zero_based_coords = !is(ranges, "GRanges"),
  explicit_tile_names = FALSE
)

Arguments

fragments

Input fragments object

ranges

Tiled regions given as GRanges, data.frame, or list. See help("genomic-ranges-like") for details on format and coordinate systems. Required attributes:

  • chr, start, end: genomic position

  • tile_width: Size of each tile in this region in basepairs

Must be non-overlapping and sorted by (chr, start), with chromosomes ordered according to the chromosome names of fragments

mode

Mode for counting tile overlaps. (See "value" section for more detail)

zero_based_coords

Whether to convert the ranges from a 1-based end-inclusive coordinate system to a 0-based end-exclusive coordinate system. Defaults to true for GRanges and false for other formats (see this archived UCSC blogpost)

explicit_tile_names

Boolean for whether to add rownames to the output matrix in format e.g chr1:500-1000, where start and end coords are given in a 0-based coordinate system. For whole-genome Tile matrices the names will take ~5 seconds to generate and take up 400MB of memory. Note that either way, tile names will be written when the matrix is saved.

Value

Iterable matrix object with dimension ranges x cells. When saved, the column names will be in the format chr1:500-1000, where start and end coords are given in a 0-based coordinate system.

mode options

  • "insertions": Start and end coordinates are separately overlapped with each tile

  • "fragments": Like "insertions", but each fragment can contribute at most 1 count to each tile, even if both the start and end coordinates overlap

Note

When calculating the matrix directly from a fragments tsv, it's necessary to first call select_chromosomes() in order to provide the ordering of chromosomes to expect while reading the tsv.