Normalize an object representing genomic ranges
Arguments
- ranges
Genomic regions given as GRanges, data.frame, or list. See help("genomic-ranges-like") for details on format and coordinate systems. Required attributes: chr, start, end: genomic position
- metadata_cols
Optional list of metadata columns to require & extract
- zero_based_coords
If true, coordinates start at 0 and the end coordinate is not included in the range. If false, coordinates start at 1 and the end coordinate is included in the range.
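The two coordinate conventions described for zero_based_coords can be illustrated with a minimal sketch. The helper `to_zero_based` below is hypothetical and not part of the package; it only shows the arithmetic, assuming integer start and end vectors given in the 1-based, end-inclusive convention.

```r
# Hypothetical helper (not a package function): convert 1-based, end-inclusive
# coordinates to the 0-based, end-exclusive convention.
to_zero_based <- function(start, end) {
  # Only the start shifts down by one; an inclusive 1-based end has the same
  # numeric value as the exclusive 0-based end.
  list(start = as.integer(start - 1L), end = as.integer(end))
}

one_based <- list(start = 101:105, end = 111:115)
zero_based <- to_zero_based(one_based$start, one_based$end)

# The number of bases covered is the same under both conventions.
width_one_based <- one_based$end - one_based$start + 1L
width_zero_based <- zero_based$end - zero_based$start
stopifnot(all(width_one_based == width_zero_based))
```

Under these assumptions, a 1-based range 101-111 corresponds to a zero-based start of 100 and end of 111.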
Value
Data frame with zero-based coordinates, and elements chr (factor), start (int), and end (int).
If ranges does not have chr level information, chr levels are the sorted unique values of chr.
If strand is in metadata_cols, then the output strand element will be TRUE for positive strand and FALSE for negative strand (converted from a character vector of "+"/"-" if necessary).
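As a rough illustration of the conversions described above, the following base R sketch builds chr levels from sorted unique values and turns a "+"/"-" character vector into a logical vector. It is an assumption about how such a conversion could be written, not the package's actual implementation, and it does not cover unstranded ("*") entries.

```r
# Input without any chr level information (plain character vector).
chr <- c("chr2", "chr1", "chr3", "chr2")

# Levels are the sorted unique values of chr.
chr_factor <- factor(chr, levels = sort(unique(chr)))

# Strand given as "+"/"-" characters: TRUE for positive, FALSE for negative.
strand <- c("-", "+", "+")
strand_lgl <- strand == "+"
```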
Examples
## Prep data
ranges <- GenomicRanges::GRanges(
  seqnames = S4Vectors::Rle(c("chr1", "chr2", "chr3"), c(1, 2, 2)),
  ranges = IRanges::IRanges(101:105, end = 111:115, names = head(letters, 5)),
  strand = S4Vectors::Rle(GenomicRanges::strand(c("-", "+", "*")), c(1, 2, 2)),
  score = 1:5,
  GC = seq(1, 0, length.out = 5)
)
ranges
#> GRanges object with 5 ranges and 2 metadata columns:
#> seqnames ranges strand | score GC
#> <Rle> <IRanges> <Rle> | <integer> <numeric>
#> a chr1 101-111 - | 1 1.00
#> b chr2 102-112 + | 2 0.75
#> c chr2 103-113 + | 3 0.50
#> d chr3 104-114 * | 4 0.25
#> e chr3 105-115 * | 5 0.00
#> -------
#> seqinfo: 3 sequences from an unspecified genome; no seqlengths
## Normalize ranges
normalize_ranges(ranges)
#> # A tibble: 5 × 3
#> chr start end
#> <fct> <int> <int>
#> 1 chr1 100 111
#> 2 chr2 101 112
#> 3 chr2 102 113
#> 4 chr3 103 114
#> 5 chr3 104 115
## With metadata information
normalize_ranges(ranges, metadata_cols = c("strand", "score", "GC"))
#> # A tibble: 5 × 6
#> strand chr start end score GC
#> <lgl> <fct> <int> <int> <int> <dbl>
#> 1 FALSE chr1 100 111 1 1
#> 2 TRUE chr2 101 112 2 0.75
#> 3 TRUE chr2 102 113 3 0.5
#> 4 TRUE chr3 103 114 4 0.25
#> 5 TRUE chr3 104 115 5 0