
Given a features x cells matrix, perform one-vs-all differential tests to find markers.

Usage

marker_features(mat, groups, method = "wilcoxon")

Arguments

mat

IterableMatrix object of dimensions features x cells

groups

Character/factor vector of cell groups/clusters, with length equal to the number of cells

method

Test method to use. Current options are:

  • wilcoxon: Wilcoxon rank-sum test, a.k.a. Mann-Whitney U test

Value

tibble with the following columns:

  • foreground: Group ID used for the foreground

  • background: Group ID used for the background (or NA if comparing to rest of cells)

  • feature: ID of the feature

  • p_val_raw: Unadjusted p-value for differential test

  • foreground_mean: Average value in the foreground group

  • background_mean: Average value in the background group

Details

Tips for using the values from this function:

  • Use dplyr::mutate() to add columns for e.g. adjusted p-value and log fold change.

  • Use dplyr::filter() to keep only the differential features that pass a given threshold (e.g. a p-value below a cutoff).

  • To get adjusted p-values, use R's p.adjust(); the recommended method is "BH".

  • To get log2 fold change: if your input matrix was already log-transformed (natural log), calculate (foreground_mean - background_mean)/log(2). If your input matrix was not log-transformed, calculate log2(foreground_mean/background_mean).
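
The tips above can be combined into one short pipeline. The following is a minimal sketch, not part of the package: it uses a small hand-made tibble with made-up values as a stand-in for the output of marker_features(), and the column names p_val_adj and log2_fc are illustrative choices. It also assumes the input matrix was not log-transformed and that p-values are adjusted across all rows together; adjusting within each foreground group is another reasonable choice.

```r
library(dplyr)

# Hypothetical stand-in for the tibble returned by marker_features();
# the values are invented purely for illustration.
marker_feats <- tibble::tibble(
  foreground = c("A", "A", "B", "B"),
  background = NA_character_,
  feature = c("gene1", "gene2", "gene1", "gene2"),
  p_val_raw = c(0.001, 0.20, 0.04, 0.80),
  foreground_mean = c(2.0, 0.5, 0.4, 1.0),
  background_mean = c(0.5, 0.5, 0.8, 0.9)
)

annotated <- marker_feats %>%
  mutate(
    # Benjamini-Hochberg adjustment over all tests in the table
    p_val_adj = p.adjust(p_val_raw, method = "BH"),
    # Assumes a non-log-transformed input matrix. For natural-log input,
    # use (foreground_mean - background_mean) / log(2) instead.
    log2_fc = log2(foreground_mean / background_mean)
  )

# Keep features that pass both an adjusted p-value and a fold-change cutoff
significant <- annotated %>%
  filter(p_val_adj < 0.05, log2_fc > 1)
```

The cutoffs 0.05 and 1 are placeholders; choose thresholds appropriate for your dataset. Features whose background_mean is 0 yield an infinite or undefined fold change under the non-log formula, so they may need separate handling.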

Examples

mat <- get_demo_mat()
groups <- sample(c("A", "B", "C", "D"), ncol(mat), replace = TRUE)
marker_feats <- marker_features(mat, groups)
#> Warning: marker features calculation requires row-major storage
#>  Consider using transpose_storage_order() if running marker_features repeatedly
#> This message is displayed once every 8 hours.
#> Writing transposed storage order to /tmp/RtmpsGFdDm/transpose1588dc92eef2

## to see the results of one specific group vs all other groups
marker_feats %>% dplyr::filter(foreground == "A")
#> # A tibble: 3,582 × 6
#>    foreground background feature       p_val_raw foreground_mean background_mean
#>    <chr>      <chr>      <chr>             <dbl>           <dbl>           <dbl>
#>  1 A          NA         ENSG00000272…     0.130          0.0275         0.0427 
#>  2 A          NA         ENSG00000250…     0.886          0.136          0.143  
#>  3 A          NA         ENSG00000275…     0.412          0              0.00103
#>  4 A          NA         ENSG00000186…     1              0              0      
#>  5 A          NA         ENSG00000286…     0.389          0.0107         0.00771
#>  6 A          NA         ENSG00000131…     0.347          0.113          0.131  
#>  7 A          NA         ENSG00000281…     0.657          0.0183         0.0211 
#>  8 A          NA         ENSG00000272…     1              0              0      
#>  9 A          NA         ENSG00000182…     0.148          0.359          0.304  
#> 10 A          NA         ENSG00000174…     0.832          0.111          0.111  
#> # ℹ 3,572 more rows

## get only differential genes given a threshold value
marker_feats %>% dplyr::filter(p_val_raw < 0.05)
#> # A tibble: 473 × 6
#>    foreground background feature       p_val_raw foreground_mean background_mean
#>    <chr>      <chr>      <chr>             <dbl>           <dbl>           <dbl>
#>  1 B          NA         ENSG00000178…   0.0360          0.0180          0.00931
#>  2 A          NA         ENSG00000163…   0.0436          0.0748          0.102  
#>  3 C          NA         ENSG00000159…   0.0380          0.205           0.145  
#>  4 A          NA         ENSG00000125…   0.0429          0.00763         0.0175 
#>  5 B          NA         ENSG00000159…   0.0238          0.0616          0.0982 
#>  6 D          NA         ENSG00000159…   0.0484          0.120           0.0787 
#>  7 B          NA         ENSG00000248…   0.0160          0.00300         0      
#>  8 C          NA         ENSG00000173…   0.00666         0               0.0143 
#>  9 B          NA         ENSG00000013…   0.00435         0.168           0.113  
#> 10 A          NA         ENSG00000246…   0.0221          0.0260          0.0123 
#> # ℹ 463 more rows