Given a set of genomic ranges, find the distance to the nearest neighbors both upstream and downstream.

Usage

range_distance_to_nearest(
  ranges,
  addArchRBug = FALSE,
  zero_based_coords = !is(ranges, "GRanges")
)

Arguments

ranges

Genomic regions given as GRanges, data.frame, or list. See help("genomic-ranges-like") for details on format and coordinate systems. Required attributes:

  • chr, start, end: genomic position

  • strand: +/- or TRUE/FALSE for positive or negative strand

addArchRBug

Boolean. If TRUE, reproduce ArchR's bug that incorrectly handles nested genes.

zero_based_coords

If true, coordinates start at 0 and the end coordinate is not included in the range. If false, coordinates start at 1 and the end coordinate is included in the range.
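
The two coordinate conventions describe the same bases with different numbers. The following base R sketch illustrates the relationship. The helper to_one_based() is hypothetical and written only for this illustration; it is not part of the documented API.

```r
# Hypothetical helper (not part of the package): convert a 0-based,
# end-exclusive range to the equivalent 1-based, end-inclusive range.
to_one_based <- function(start, end) {
  list(start = start + 1, end = end)
}

# The first 10 bases of a chromosome in 0-based, end-exclusive form
zero_based <- list(start = 0, end = 10)
one_based <- to_one_based(zero_based$start, zero_based$end)

# Both representations cover the same number of bases
width_zero <- zero_based$end - zero_based$start     # end is excluded
width_one <- one_based$end - one_based$start + 1    # end is included
```

Only the start coordinate shifts by one; the end coordinate has the same numeric value in both conventions.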

Value

A 2-column data.frame with columns upstream and downstream, containing the distances to the nearest neighbor in the respective directions. For ranges on + or * strand, distance is calculated as:

  • upstream = max(start(range) - end(upstreamNeighbor), 0)

  • downstream = max(start(downstreamNeighbor) - end(range), 0)

For ranges on - strand, the definition of upstream and downstream is flipped. Note that this definition of distance differs by one from GenomicRanges::distance(): ranges that are adjacent but do not overlap are given a distance of 1 rather than 0.
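
The formulas above can be checked by hand on a small example. This base R sketch does not call range_distance_to_nearest(). It assumes 1-based, end-inclusive coordinates, and it assumes that "flipped" on the - strand means the higher-coordinate neighbor is the upstream one. The ranges and the gap() helper are made up for illustration.

```r
# Hypothetical helper: distance between a range and the neighbor to its right,
# following max(start(right) - end(left), 0) from the formulas above.
gap <- function(left, right) {
  max(right[["start"]] - left[["end"]], 0)
}

# Made-up ranges on one chromosome, 1-based and end-inclusive (assumption)
left_neighbor <- c(start = 100, end = 200)
query <- c(start = 201, end = 300)           # directly adjacent to left_neighbor
right_neighbor <- c(start = 350, end = 400)
overlapping <- c(start = 150, end = 250)     # overlaps query

# + strand: upstream is the lower-coordinate side
upstream <- gap(left_neighbor, query)        # 201 - 200 = 1 (adjacent, not 0)
downstream <- gap(query, right_neighbor)     # 350 - 300 = 50

# Overlapping ranges are clamped to 0 by max(..., 0)
overlap_distance <- gap(overlapping, query)  # max(201 - 250, 0) = 0

# - strand (assumed interpretation): the higher-coordinate neighbor is upstream
minus_upstream <- gap(query, right_neighbor)   # 50
minus_downstream <- gap(left_neighbor, query)  # 1
```

The adjacent pair yields 1 here, whereas GenomicRanges::distance() would report 0 for the same pair.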