R/simplify_signature.R
simplify_signature.Rd
Take a signature representation from SuperSig and group the trinucleotides within each feature into interpretable labels, optionally using IUPAC labels from IUPAC_CODE_MAP in the Biostrings package.
simplify_signature(object, iupac)
object | an object of class |
---|---|
iupac | logical value indicating whether to use IUPAC labels
(iupac = |
simplify_signature returns a vector of simplified features and their difference in mean rates between exposed and unexposed (or the average rate if the factor is "age").
#>   sample_id age chromosome  position ref alt
#> 1         1  50       chr1  94447621   G   C
#> 2         1  50       chr2 202005395   A   C
#> 3         1  50       chr7  20784978   T   A
#> 4         1  50       chr7  87179255   C   G
#> 5         1  50      chr19   1059712   G   T
#> 6         2  55       chr1  76226977   T   C

input_dt <- make_matrix(example_dt) # convert to correct format
input_dt$IndVar <- c(1, 1, 1, 0, 0) # add IndVar column
supersig <- get_signature(data = input_dt, factor = "Smoking")

simplify_signature(object = supersig, iupac = FALSE)
#>           C>A           C>T           T>C    [C>G](ACT)    [T>A](CTG)
#> -1.286204e-04 -1.286204e-04 -1.256126e-04 -1.148120e-04 -1.058889e-04
#>    [T>G](ACG)   (ACT)[C>G]G   (ATG)[T>A]A   (CTG)[T>G]T
#> -9.105143e-05 -9.967173e-06 -1.551465e-05 -2.743243e-05

simplify_signature(object = supersig, iupac = TRUE)
#>           C>A           C>T           T>C        [C>G]H        [T>A]B
#> -1.286204e-04 -1.286204e-04 -1.256126e-04 -1.148120e-04 -1.058889e-04
#>        [T>G]V       H[C>G]G       D[T>A]A       B[T>G]T
#> -9.105143e-05 -9.967173e-06 -1.551465e-05 -2.743243e-05
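The two calls above differ only in how grouped flanking bases are labeled: with iupac = FALSE a group is written out in parentheses, such as (ACT), and with iupac = TRUE it is collapsed to a single IUPAC code, such as H. The following is a minimal sketch of that relabeling step, not the package's actual implementation. The helpers to_iupac and relabel are hypothetical names introduced for illustration, and iupac_map is a local copy of the code-to-bases mapping that mirrors Biostrings::IUPAC_CODE_MAP so the sketch runs without Biostrings installed.

```r
# Local copy of the IUPAC code-to-bases mapping, mirroring
# Biostrings::IUPAC_CODE_MAP (an assumption made to avoid the dependency).
iupac_map <- c(
  A = "A", C = "C", G = "G", T = "T",
  M = "AC", R = "AG", W = "AT", S = "CG", Y = "CT", K = "GT",
  V = "ACG", H = "ACT", D = "AGT", B = "CGT", N = "ACGT"
)

# Hypothetical helper (not part of supersigs): collapse a group of bases,
# e.g. "CTG", into its single-letter IUPAC code.
to_iupac <- function(bases) {
  # Sort the bases so that "CTG" and "CGT" look up the same entry.
  sorted <- paste(sort(unique(strsplit(bases, "")[[1]])), collapse = "")
  code <- names(iupac_map)[match(sorted, iupac_map)]
  if (is.na(code)) stop("no IUPAC code for: ", bases)
  code
}

# Hypothetical helper: replace every parenthesised group in a feature label.
relabel <- function(label) {
  m <- gregexpr("\\([ACGT]+\\)", label)
  groups <- regmatches(label, m)[[1]]
  if (length(groups) == 0) return(label) # nothing to relabel, e.g. "C>A"
  codes <- vapply(gsub("[()]", "", groups), to_iupac, character(1))
  regmatches(label, m) <- list(unname(codes))
  label
}

relabel("[C>G](ACT)")  # "[C>G]H"
relabel("(ATG)[T>A]A") # "D[T>A]A"
relabel("C>A")         # "C>A"
```

Applied to the labels in the iupac = FALSE output, this reproduces the labels shown for iupac = TRUE; features without a parenthesised group, such as C>A, are left unchanged.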