Subset fragments to those that overlap (or do not overlap) a set of genomic regions.

Usage

select_regions(
  fragments,
  ranges,
  invert_selection = FALSE,
  zero_based_coords = !is(ranges, "GRanges")
)

Arguments

fragments

Input fragments object.

ranges

Peaks/ranges to overlap, given as GRanges, data.frame, or list. See help("genomic-ranges-like") for details on format and coordinate systems. Required attributes:

  • chr, start, end: genomic position

invert_selection

If TRUE, select only the fragments that do not overlap the selected regions, instead of only the fragments that do overlap them.
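
The selection rule can be illustrated with a small base R sketch. This is not the package's implementation: `select_by_overlap` is a hypothetical helper, it handles a single region on a single chromosome, and it assumes all coordinates are already 0-based and end-exclusive. The fragment values are the ones used in the Examples section, and the region chr1:60-130 (1-based, end-inclusive) is written as the half-open interval [59, 130).

```r
# Hypothetical helper, for illustration only (not part of the package).
# All coordinates are assumed to be 0-based and end-exclusive.
select_by_overlap <- function(frag_start, frag_end,
                              region_start, region_end,
                              invert_selection = FALSE) {
  # Two half-open intervals overlap when each starts before the other ends
  overlaps <- frag_start < region_end & frag_end > region_start
  # invert_selection flips which fragments are kept
  if (invert_selection) !overlaps else overlaps
}

# Same fragment coordinates as in the Examples section
frag_start <- seq(10, 260, 50)
frag_end <- frag_start + seq(5, 30, 5)

keep <- select_by_overlap(frag_start, frag_end, 59, 130)
frag_start[keep]
#> [1]  60 110

drop <- select_by_overlap(frag_start, frag_end, 59, 130, invert_selection = TRUE)
frag_start[drop]
#> [1]  10 160 210 260
```

The two results partition the fragments: every fragment is returned by exactly one of the two calls.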

zero_based_coords

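Whether the ranges are already in a 0-based end-exclusive coordinate system. If FALSE, the ranges are converted from a 1-based end-inclusive coordinate system to a 0-based end-exclusive coordinate system. As the default `!is(ranges, "GRanges")` in Usage shows, this is FALSE for GRanges (so GRanges input is converted) and TRUE for other formats (see this archived UCSC blogpost).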
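
As a minimal sketch of what the conversion between the two coordinate systems involves, the base R snippet below shifts a 1-based end-inclusive range to its 0-based end-exclusive equivalent. `to_zero_based` is a hypothetical helper written for this illustration, not a function from the package.

```r
# Hypothetical helper, for illustration only (not part of the package).
# Converts 1-based, end-inclusive ranges to 0-based, end-exclusive ranges.
to_zero_based <- function(start, end) {
  # Only the start shifts by one: an inclusive end k (1-based) and an
  # exclusive end k (0-based) refer to the same last base.
  list(start = start - 1, end = end)
}

# The region chr1:60-130 read as 1-based, end-inclusive
converted <- to_zero_based(start = 60, end = 130)
converted$start
#> [1] 59
converted$end
#> [1] 130

# The width is unchanged: 130 - 60 + 1 bases before, 130 - 59 after
converted$end - converted$start
#> [1] 71
```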

Value

Fragments object filtered according to the selected regions

Examples

frags <- tibble::tibble(
  chr = "chr1",
  start = seq(10, 260, 50),
  end = start + seq(5, 30, 5),
  cell_id = "cell1"
)
frags
#> # A tibble: 6 × 4
#>   chr   start   end cell_id
#>   <chr> <dbl> <dbl> <chr>  
#> 1 chr1     10    15 cell1  
#> 2 chr1     60    70 cell1  
#> 3 chr1    110   125 cell1  
#> 4 chr1    160   180 cell1  
#> 5 chr1    210   235 cell1  
#> 6 chr1    260   290 cell1  
frags <- frags %>% convert_to_fragments()

region <- tibble::tibble(
  chr = "chr1",
  start = 60,
  end = 130
) %>% as("GRanges")

## Select fragments overlapping the region
select_regions(frags, region) %>% as("GRanges")
#> GRanges object with 2 ranges and 1 metadata column:
#>       seqnames    ranges strand |  cell_id
#>          <Rle> <IRanges>  <Rle> | <factor>
#>   [1]     chr1     61-70      * |    cell1
#>   [2]     chr1   111-125      * |    cell1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## Select fragments not overlapping the region
select_regions(frags, region, invert_selection = TRUE) %>% as("GRanges")
#> GRanges object with 4 ranges and 1 metadata column:
#>       seqnames    ranges strand |  cell_id
#>          <Rle> <IRanges>  <Rle> | <factor>
#>   [1]     chr1     11-15      * |    cell1
#>   [2]     chr1   161-180      * |    cell1
#>   [3]     chr1   211-235      * |    cell1
#>   [4]     chr1   261-290      * |    cell1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths