bpcells.experimental.pseudobulk_insertion_counts
- bpcells.experimental.pseudobulk_insertion_counts(fragments: str, regions: DataFrame, cell_groups: Sequence[int], bin_size: int = 1) → ndarray
Calculate a pseudobulk coverage matrix
Coverage is calculated as the number of start/end coordinates falling into a given position bin.
- Parameters:
fragments (str) – Path to BPCells fragments directory
regions (pandas.DataFrame) – Pandas dataframe with columns (chrom, start, end) representing genomic ranges (0-based, end-exclusive like BED format). All regions must be the same size. chrom should be a string column; start/end should be numeric.
cell_groups (list[int]) – List of pseudobulk groupings as created by build_cell_groups()
bin_size (int) – Size for bins within each region given in basepairs. If the region width is not an even multiple of bin_size, then the last bin in each region may be truncated.
- Returns:
NumPy array with dimensions (region, pseudobulk, position) and type numpy.int32
- Return type:
ndarray
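As an illustration of the (region, pseudobulk, position) layout described under Returns, the sketch below reduces such an array along different axes. The array is a made-up stand-in built with NumPy, not real output from this function.

```python
import numpy as np

# Stand-in for a result: 3 regions, 2 pseudobulks, 4 position bins.
counts = np.arange(24, dtype=np.int32).reshape(3, 2, 4)

# Aggregate positional profile per pseudobulk: sum over the region axis.
profile = counts.sum(axis=0)       # shape (2, 4)

# Total insertions per region and pseudobulk: sum over the position axis.
per_region = counts.sum(axis=2)    # shape (3, 2)

print(profile.shape, per_region.shape)
```

Summing over axis 0 is a common way to build an average footprint or TSS-style profile, since all regions are required to have the same width.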
See also