bpcells.experimental.pseudobulk_insertion_counts
- bpcells.experimental.pseudobulk_insertion_counts(fragments: str, regions: DataFrame, cell_groups: Sequence[int], bin_size: int = 1) → ndarray
Calculate a pseudobulk coverage matrix
Coverage is calculated as the number of start/end coordinates falling into a given position bin.
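To make the counting rule concrete, here is a minimal pure-Python sketch of the idea. It is not the BPCells implementation. The helper name `pseudobulk_counts_sketch`, the tuple-based inputs, and the nested-list output are hypothetical and chosen only for illustration. The sketch also assumes that each region width is an exact multiple of the bin size and that the end insertion is counted at `end - 1` (the text above does not say whether the library uses `end` or `end - 1`).

```python
def pseudobulk_counts_sketch(fragments, regions, n_groups, bin_size=1):
    """Count fragment start/end coordinates per (region, group, position bin).

    fragments: iterable of (chrom, start, end, group), 0-based, end-exclusive
    regions:   list of (chrom, start, end), all the same width
    Returns nested lists indexed as counts[region][group][bin].
    """
    width = regions[0][2] - regions[0][1]
    if any(end - start != width for _, start, end in regions):
        raise ValueError("all regions must be the same size")
    if width % bin_size != 0:
        # Simplification for this sketch only; see the bin_size parameter notes.
        raise ValueError("this sketch assumes width is a multiple of bin_size")
    n_bins = width // bin_size

    counts = [[[0] * n_bins for _ in range(n_groups)] for _ in regions]
    for chrom, start, end, group in fragments:
        # Assumption: the two insertion sites are `start` and `end - 1`.
        for pos in (start, end - 1):
            for r, (r_chrom, r_start, r_end) in enumerate(regions):
                if chrom == r_chrom and r_start <= pos < r_end:
                    counts[r][group][(pos - r_start) // bin_size] += 1
    return counts


regions = [("chr1", 100, 110), ("chr1", 200, 210)]
fragments = [
    ("chr1", 101, 106, 0),  # insertions at 101 and 105, both inside region 0
    ("chr1", 95, 104, 1),   # 95 is outside every region; 103 is in region 0
    ("chr1", 205, 300, 0),  # 205 is in region 1; 299 is outside every region
]
counts = pseudobulk_counts_sketch(fragments, regions, n_groups=2, bin_size=5)
print(counts)  # [[[1, 1], [1, 0]], [[0, 1], [0, 0]]]
```

The nested lists follow the same (region, pseudobulk, position) axis order as the array described under Returns. The linear scan over regions keeps the sketch short and would be too slow for real data.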
- Parameters:
fragments (str) – Path to BPCells fragments directory
regions (pandas.DataFrame) – Pandas dataframe with columns (chrom, start, end) representing genomic ranges (0-based, end-exclusive like BED format). All regions must be the same size. chrom should be a string column; start/end should be numeric.
cell_groups (list[int]) – List of pseudobulk groupings as created by build_cell_groups()
bin_size (int) – Size for bins within each region given in basepairs. If the region width is not an even multiple of bin_size, then the last bin of a region may be truncated.
- Returns:
NumPy array with dimensions (region, pseudobulk, position) and type numpy.int32
- Return type:
numpy.ndarray
See also