
ArchR-style gene activity scores are a weighted sum over tiles, where each tile's weight depends on its signed distance to a gene body. This function calculates those signed distances using ArchR's default parameters.

Usage

gene_score_tiles_archr(
  genes,
  chromosome_sizes = NULL,
  tile_width = 500,
  addArchRBug = FALSE
)

Arguments

genes

Gene coordinates given as GRanges, data.frame, or list. See help("genomic-ranges-like") for details on format and coordinate systems. Required attributes:

  • chr, start, end: genomic position

  • strand: +/- or TRUE/FALSE for positive or negative strand

chromosome_sizes

(optional) Size of chromosomes as a genomic-ranges object

tile_width

Width of each tile, in base pairs

addArchRBug

If TRUE, replicate ArchR's bug in the handling of nested genes

Value

Tibble with one range per tile, with additional metadata columns gene_idx (row index of the gene this tile corresponds to) and distance.

The distance is signed: a tile with a smaller start coordinate than a gene on the + strand has a negative distance. Adjacent but non-overlapping regions have a distance of 1bp, counting up from there.
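
The sign convention can be illustrated with a minimal base R sketch. The helper `signed_tile_distance()` is hypothetical and not part of the package. It assumes 0-based, end-exclusive coordinates, that overlapping regions have distance 0, and that the sign flips for genes on the - strand; only the + strand case is stated above.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# Assumes 0-based, end-exclusive coordinates and strand given as "+" or "-".
signed_tile_distance <- function(tile_start, tile_end, gene_start, gene_end, strand) {
  # Gap when the tile lies at smaller coordinates than the gene (1 if adjacent)
  gap_before <- gene_start - tile_end + 1
  # Gap when the tile lies at larger coordinates than the gene (1 if adjacent)
  gap_after <- tile_start - gene_end + 1
  # Overlapping regions get distance 0 (assumption)
  unsigned <- pmax(gap_before, gap_after, 0)
  # Tiles at smaller coordinates are negative for + strand genes
  side <- ifelse(gap_before > 0, -1, 1)
  # Assumption: the sign is mirrored for - strand genes
  strand_sign <- ifelse(strand == "+", 1, -1)
  unsigned * side * strand_sign
}

# Tile [0, 500) is adjacent to and before a + strand gene at [500, 1000)
signed_tile_distance(0, 500, 500, 1000, "+")
# Tile [1000, 1500) is adjacent to and after the same gene
signed_tile_distance(1000, 1500, 500, 1000, "+")
```

Under these assumptions the first call gives -1 and the second gives 1; both signs would swap for a gene on the - strand.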

Details

ArchR's tile distance algorithm works as follows:

  1. Genes are extended 5kb upstream

  2. Genes are linked to any tiles 1kb-100kb upstream and downstream, but tiles beyond a neighboring gene are not considered
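
As a rough illustration of step 1 only, the strand-aware upstream extension might look like the base R sketch below. The helper `extend_upstream()` and the example coordinates are hypothetical, and clamping the start at 0 is an assumption rather than documented behavior. Step 2 (the linking window and neighboring-gene cutoff) is not covered.

```r
# Hypothetical helper illustrating step 1 only; not the package's code.
# Expects a data.frame with columns chr, start, end, and strand ("+" or "-").
extend_upstream <- function(genes, upstream = 5000) {
  is_plus <- genes$strand == "+"
  # On the + strand, upstream means smaller coordinates, so move the start
  # (clamped at 0, which is an assumption of this sketch)
  genes$start <- ifelse(is_plus, pmax(genes$start - upstream, 0), genes$start)
  # On the - strand, upstream means larger coordinates, so move the end
  genes$end <- ifelse(is_plus, genes$end, genes$end + upstream)
  genes
}

# Made-up example genes, one per strand
genes <- data.frame(
  chr = c("chr1", "chr1"),
  start = c(10000, 50000),
  end = c(20000, 60000),
  strand = c("+", "-")
)
extended <- extend_upstream(genes)
```

Here the + strand gene would span 5000 to 20000 after extension, and the - strand gene 50000 to 65000.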