Subset fragments to those that overlap (or do not overlap) a set of genomic regions.
Usage
select_regions(
  fragments,
  ranges,
  invert_selection = FALSE,
  zero_based_coords = !is(ranges, "GRanges")
)
Arguments
- fragments
Input fragments object.
- ranges
Peaks/ranges to overlap, given as GRanges, data.frame, or list. See help("genomic-ranges-like") for details on format and coordinate systems. Required attributes: chr, start, end (genomic position).
- invert_selection
If TRUE, select the fragments that do not overlap any of the given regions, instead of the fragments that do.
- zero_based_coords
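Whether the input ranges already use a 0-based, end-exclusive coordinate system. If FALSE, the ranges are treated as 1-based, end-inclusive and converted to 0-based, end-exclusive. As the default in Usage shows, this is FALSE for GRanges (conversion is applied) and TRUE for other formats (no conversion). Background on the two coordinate systems is covered in an archived UCSC blog post.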
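The coordinate conversion behind zero_based_coords can be sketched in base R. This is only an illustration of the arithmetic, not the package's implementation: the helpers `to_zero_based()` and `overlaps_half_open()` are hypothetical names introduced here, and chromosome matching is ignored because all example data sit on chr1.

```r
# Hypothetical helpers for illustration; not part of the package API.

# Convert 1-based, end-inclusive ranges to 0-based, end-exclusive ranges.
# Only the start shifts: [start, end] (1-based) covers the same bases as
# [start - 1, end) (0-based).
to_zero_based <- function(ranges) {
  ranges$start <- ranges$start - 1
  ranges
}

# TRUE for each fragment that shares at least one base with the region.
# All coordinates are assumed to be 0-based, end-exclusive.
overlaps_half_open <- function(frag_start, frag_end, region_start, region_end) {
  frag_start < region_end & frag_end > region_start
}

# Region written the way a GRanges stores it: 1-based, end-inclusive.
region <- to_zero_based(data.frame(chr = "chr1", start = 60, end = 130))

# Fragment coordinates from the example data (0-based, end-exclusive).
frag_start <- seq(10, 260, 50)
frag_end <- frag_start + seq(5, 30, 5)

keep <- overlaps_half_open(frag_start, frag_end, region$start, region$end)
frag_start[keep]
```

Under these assumptions the sketch keeps the fragments starting at 60 and 110, the same two fragments that the Examples section reports as 61-70 and 111-125 once printed in 1-based GRanges coordinates.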
Examples
frags <- tibble::tibble(
  chr = "chr1",
  start = seq(10, 260, 50),
  end = start + seq(5, 30, 5),
  cell_id = "cell1"
)
frags
#> # A tibble: 6 × 4
#>   chr   start   end cell_id
#>   <chr> <dbl> <dbl> <chr>
#> 1 chr1     10    15 cell1
#> 2 chr1     60    70 cell1
#> 3 chr1    110   125 cell1
#> 4 chr1    160   180 cell1
#> 5 chr1    210   235 cell1
#> 6 chr1    260   290 cell1
frags <- frags %>% convert_to_fragments()
region <- tibble::tibble(
  chr = "chr1",
  start = 60,
  end = 130
) %>% as("GRanges")
## Select fragments overlapping the region
select_regions(frags, region) %>% as("GRanges")
#> GRanges object with 2 ranges and 1 metadata column:
#>       seqnames    ranges strand |  cell_id
#>          <Rle> <IRanges>  <Rle> | <factor>
#>   [1]     chr1     61-70      * |    cell1
#>   [2]     chr1   111-125      * |    cell1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
## Select fragments not overlapping the region
select_regions(frags, region, invert_selection = TRUE) %>% as("GRanges")
#> GRanges object with 4 ranges and 1 metadata column:
#>       seqnames    ranges strand |  cell_id
#>          <Rle> <IRanges>  <Rle> | <factor>
#>   [1]     chr1     11-15      * |    cell1
#>   [2]     chr1   161-180      * |    cell1
#>   [3]     chr1   211-235      * |    cell1
#>   [4]     chr1   261-290      * |    cell1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
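The ranges argument is also documented as accepting a data.frame. The sketch below assumes the package is BPCells (the page does not name it) and that a plain data.frame with chr, start, and end columns is accepted as described under Arguments. The name `region_df` is introduced here for illustration, and its coordinates are written 0-based, end-exclusive, so the 1-based region 60-130 becomes start = 59, end = 130.

```r
library(BPCells)  # assumed package name; adjust if select_regions lives elsewhere

# Rebuild the example fragments so this block runs on its own.
frags <- convert_to_fragments(tibble::tibble(
  chr = "chr1",
  start = seq(10, 260, 50),
  end = start + seq(5, 30, 5),
  cell_id = "cell1"
))

# The same region as the GRanges example, in 0-based, end-exclusive form.
region_df <- data.frame(chr = "chr1", start = 59, end = 130)

# zero_based_coords is spelled out for clarity; per the Usage default it
# would already be TRUE for a non-GRanges input.
selected <- select_regions(frags, region_df, zero_based_coords = TRUE)
as(selected, "GRanges")
```

If the assumptions hold, this should select the same fragments as the GRanges-based call above. Stating zero_based_coords explicitly is a design choice that guards against silent off-by-one shifts when the same coordinates move between GRanges and data.frame inputs.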