Calling peaks from a pre-set list of tiles can be much faster than using
dedicated peak-calling software like macs3
. The resulting peaks are less
precise in terms of exact coordinates, but should be sufficient for most
analyses.
Arguments
- fragments
IterableFragments object
- chromosome_sizes
Chromosome start and end coordinates given as GRanges, data.frame, or list. See
help("genomic-ranges-like")
for details on format and coordinate systems. Required attributes:chr
,start
,end
: genomic position
- cell_groups
Grouping vector with one entry per cell in fragments, e.g. cluster IDs
- effective_genome_size
(Optional) effective genome size for poisson background rate estimation. See deeptools for values for common genomes. Defaults to sum of chromosome sizes, which overestimates peak significance
- peak_width
Width of candidate peaks
- peak_tiling
Number of candidate peaks overlapping each base of genome. E.g. peak_width = 300 and peak_tiling = 3 results in candidate peaks of 300bp spaced 100bp apart
- fdr_cutoff
Adjusted p-value significance cutoff
- merge_peaks
How to merge significant peaks with
merge_peaks_iterative()
"all"
Merge the full set of peaks"group"
Merge peaks within each group"none"
Don't perform any merging
Value
tibble with peak calls and the following columns:
chr
,start
,end
: genome coordinatesgroup
: group ID that this peak was identified inp_val
,q_val
: Poission p-value and BH-corrected p-valueenrichment
: Enrichment of counts in this peak compared to a genome-wide background
Details
Peak calling steps:
Estimate the genome-wide expected insertions per tile based on
peak_width
,effective_genome_size
, and per-group read countsTile the genome with nonoverlapping tiles of size peak_width
For each tile and group, calculate p_value based on a Poisson model
Compute adjusted p-values using BH method and using the total number of tiles as the number of hypotheses tested.
Repeat steps 2-4
peak_tiling
times, with evenly spaced offsetsIf
merge_peaks
is "all" or "group": usemerge_peaks_iterative()
within each group to keep only the most significant of the overlapping candidate peaksIf
merge_peaks
is "all", perform a final round ofmerge_peaks_iterative()
, prioritizing each peak by its within-group significance rank