Calculate gene-tile distances for ArchR gene activities
Source: R/geneScores.R
ArchR-style gene activity scores are a weighted sum over tiles, with each tile weighted according to its signed distance from a gene body. This function calculates those signed distances according to ArchR's default parameters.
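As a rough illustration of how signed distances feed into a weighted sum, the sketch below applies ArchR's default gene model, `exp(-abs(x)/5000) + exp(-1)`, to a few distances. The distances and tile counts are hypothetical, `archr_weight` is a helper name made up for this example, and any further scaling ArchR applies (for example by gene size) is left out.

```r
# Hypothetical signed tile-to-gene distances in bp (illustrative values only)
distance <- c(-6369, -1869, 0, 1, 50000)

# ArchR's default gene model, written as an R function of the distance x.
# `archr_weight` is a hypothetical helper name, not part of any package.
archr_weight <- function(x) exp(-abs(x) / 5000) + exp(-1)

# Weights decay with absolute distance and never drop below exp(-1)
weights <- archr_weight(distance)

# Hypothetical per-tile accessibility counts for a single cell
tile_counts <- c(0, 2, 5, 1, 3)

# The (unscaled) gene activity score is a weighted sum over the linked tiles
gene_score <- sum(weights * tile_counts)
```

Only the absolute distance matters under this particular model; the sign is kept in the output so that other weighting schemes can treat upstream and downstream tiles differently.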
Usage
gene_score_tiles_archr(
  genes,
  chromosome_sizes = NULL,
  tile_width = 500,
  addArchRBug = FALSE
)
Arguments
- genes
Gene coordinates given as GRanges, data.frame, or list. See help("genomic-ranges-like") for details on format and coordinate systems. Required attributes:
  - chr, start, end: genomic position
  - strand: +/- or TRUE/FALSE for positive or negative strand
- chromosome_sizes
(optional) Size of chromosomes as a genomic-ranges object
- tile_width
Size of tiles to consider, in bp
- addArchRBug
Replicate ArchR bug in handling nested genes
Value
Tibble with one range per tile, with additional metadata columns gene_idx (row index of the gene this tile corresponds to) and distance.
Distance is signed: if the tile has a smaller start coordinate than the gene and the gene is on the + strand, the distance is negative. Adjacent but non-overlapping regions have a distance of 1bp, counting up from there.
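The sign convention can be sketched in a few lines of R. This is a minimal illustration, not the package's implementation: `signed_tile_distance` is a hypothetical helper, coordinates are assumed to be 0-based with exclusive ends, overlapping regions are assumed to have distance 0, and the sign is assumed to flip for genes on the - strand. The gene coordinates are made up for the example.

```r
# Hypothetical helper: signed distance from a tile to a (possibly extended) gene.
# Assumes 0-based start coordinates and exclusive end coordinates.
signed_tile_distance <- function(tile_start, tile_end,
                                 gene_start, gene_end,
                                 plus_strand = TRUE) {
  d <- ifelse(
    tile_end <= gene_start,
    # Tile lies entirely below the gene: negative, adjacent regions give -1
    -(gene_start - tile_end + 1),
    ifelse(
      tile_start >= gene_end,
      # Tile lies entirely above the gene: positive, adjacent regions give 1
      tile_start - gene_end + 1,
      # Assumption: overlapping regions have distance 0
      0
    )
  )
  # Assumption: the sign is mirrored for genes on the - strand
  ifelse(plus_strand, d, -d)
}

# Tile [0, 500) against a made-up + strand gene spanning [6868, 14409)
signed_tile_distance(0, 500, 6868, 14409, TRUE)      # -6369
# A tile directly adjacent to the gene start
signed_tile_distance(6368, 6868, 6868, 14409, TRUE)  # -1
# The same adjacent tile for a - strand gene
signed_tile_distance(6368, 6868, 6868, 14409, FALSE) # 1
```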
Details
ArchR's tile distance algorithm works as follows:
- Genes are extended 5kb upstream.
- Genes are linked to any tiles 1kb-100kb upstream and downstream, but tiles beyond a neighboring gene are not considered.
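The upstream extension step can be sketched as follows. This is an illustrative sketch only: `extend_upstream` is a hypothetical helper, the gene coordinates are made up, strand is assumed to be given as "+"/"-", and clamping starts at 0 is an assumption. The linking of tiles within the 1kb-100kb window and the neighboring-gene cutoff are not covered here.

```r
# Made-up gene coordinates for illustration
genes <- data.frame(
  chr = c("chr1", "chr1"),
  start = c(10000, 50000),
  end = c(12000, 53000),
  strand = c("+", "-")
)

# Hypothetical helper: extend each gene body upstream in a strand-aware way
extend_upstream <- function(genes, upstream = 5000) {
  is_plus <- genes$strand == "+"
  # For + strand genes, upstream means smaller coordinates, so move the start.
  # Assumption: starts are clamped at 0 rather than going negative.
  new_start <- ifelse(is_plus, pmax(genes$start - upstream, 0), genes$start)
  # For - strand genes, upstream means larger coordinates, so move the end.
  # Clamping at the chromosome end is omitted in this sketch.
  new_end <- ifelse(is_plus, genes$end, genes$end + upstream)
  genes$start <- new_start
  genes$end <- new_end
  genes
}

extended <- extend_upstream(genes)
```

With these inputs, the + strand gene becomes 5000-12000 and the - strand gene becomes 50000-58000.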
Examples
## Prep data
directory <- file.path(tempdir(), "references")
genes <- read_gencode_genes(
  directory,
  release = "42",
  annotation_set = "basic"
)
## Get gene scores by tile
gene_score_tiles_archr(
  genes
)
#> # A tibble: 6,900,314 × 5
#> chr start end gene_idx distance
#> <fct> <dbl> <dbl> <int> <dbl>
#> 1 chr1 0 500 1 -6369
#> 2 chr1 500 1000 1 -5869
#> 3 chr1 1000 1500 1 -5369
#> 4 chr1 1500 2000 1 -4869
#> 5 chr1 2000 2500 1 -4369
#> 6 chr1 2500 3000 1 -3869
#> 7 chr1 3000 3500 1 -3369
#> 8 chr1 3500 4000 1 -2869
#> 9 chr1 4000 4500 1 -2369
#> 10 chr1 4500 5000 1 -1869
#> # ℹ 6,900,304 more rows