Skip to contents

Calling peaks from a pre-set list of tiles can be much faster than using dedicated peak-calling software like macs3. The resulting peaks are less precise in terms of exact coordinates, but should be sufficient for most analyses.

Usage

call_peaks_tile(
  fragments,
  chromosome_sizes,
  cell_groups = rep.int("all", length(cellNames(fragments))),
  effective_genome_size = NULL,
  peak_width = 200,
  peak_tiling = 3,
  fdr_cutoff = 0.01,
  merge_peaks = c("all", "group", "none")
)

Arguments

fragments

IterableFragments object

chromosome_sizes

Chromosome start and end coordinates given as GRanges, data.frame, or list. See help("genomic-ranges-like") for details on format and coordinate systems. Required attributes:

  • chr, start, end: genomic position

See read_ucsc_chrom_sizes().

cell_groups

Grouping vector with one entry per cell in fragments, e.g. cluster IDs

effective_genome_size

(Optional) effective genome size for poisson background rate estimation. See deeptools for values for common genomes. Defaults to sum of chromosome sizes, which overestimates peak significance

peak_width

Width of candidate peaks

peak_tiling

Number of candidate peaks overlapping each base of genome. E.g. peak_width = 300 and peak_tiling = 3 results in candidate peaks of 300bp spaced 100bp apart

fdr_cutoff

Adjusted p-value significance cutoff

merge_peaks

How to merge significant peaks with merge_peaks_iterative()

  • "all" Merge the full set of peaks

  • "group" Merge peaks within each group

  • "none" Don't perform any merging

Value

tibble with peak calls and the following columns:

  • chr, start, end: genome coordinates

  • group: group ID that this peak was identified in

  • p_val, q_val: Poission p-value and BH-corrected p-value

  • enrichment: Enrichment of counts in this peak compared to a genome-wide background

Details

Peak calling steps:

  1. Estimate the genome-wide expected insertions per tile based on peak_width, effective_genome_size, and per-group read counts

  2. Tile the genome with nonoverlapping tiles of size peak_width

  3. For each tile and group, calculate p_value based on a Poisson model

  4. Compute adjusted p-values using BH method and using the total number of tiles as the number of hypotheses tested.

  5. Repeat steps 2-4 peak_tiling times, with evenly spaced offsets

  6. If merge_peaks is "all" or "group": use merge_peaks_iterative() within each group to keep only the most significant of the overlapping candidate peaks

  7. If merge_peaks is "all", perform a final round of merge_peaks_iterative(), prioritizing each peak by its within-group significance rank