- Fixed a bug that generated a wrong data type when only one sample is handled (#463). Thanks to @selkamand.
- Updated `sig_fit()` related documents for better usage (#454).
- Added `cluster_col` to `show_group_enrichment()`.
- Fixed the bug that an error was returned when `cluster_row = TRUE` & `return_list = TRUE` in function `show_group_enrichment()`.
- Fixed the error in generating DBS and INDEL matrices when only one sample is input (#453).
- Supported human T2T genome and corresponding annotation data.
- Updated COSMIC database to v3.4. SV and RNA-SBS signatures are included, e.g. `get_sig_db("latest_RNA-SBS_GRCh37")` and `get_sig_db("latest_SV_GRCh38")`.
- Fixed a bug in generating matrix for variation categories with strand bias due to problematic counting. (#445)
- Updated pkg doc following the new CRAN feature (thanks to K from the CRAN team).
- Added `samps` option to `show_sig_exposure()`. Example:

      load(system.file("extdata", "toy_mutational_signature.RData",
        package = "sigminer", mustWork = TRUE
      ))
      # Show signature exposure
      p1 <- show_sig_exposure(sig2, rm_space = TRUE)
      p1
      # Show samples ordered by their total exposure
      expo <- sig_exposure(sig2)
      show_sig_exposure(expo,
        rm_space = TRUE,
        samps = colnames(expo)[order(colSums(expo))])

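
The ordering idiom passed to `samps` above can be checked in isolation with base R. This is a minimal sketch on a made-up exposure matrix (signatures in rows, samples in columns); the values and sample names are assumptions for illustration and do not come from the package data.

```r
# Toy exposure matrix (hypothetical values): rows are signatures, columns are samples.
expo <- matrix(
  c(10, 5, # sample S1
    2, 1,  # sample S2
    7, 9), # sample S3
  nrow = 2,
  dimnames = list(c("Sig1", "Sig2"), c("S1", "S2", "S3"))
)

# Total exposure per sample, then sample names sorted from lowest to highest total.
totals <- colSums(expo)
samps <- colnames(expo)[order(totals)]
samps
```

The resulting character vector is what would be handed to the `samps` option to control the sample order on the x axis.
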
- Fixed the error in generating SBS matrix when only one sample is input (#432).
- Removed package 'copynumber' from the Suggests field.
- Supported the Ziyu Tao et al. approach for copy number segment classification.
- Supported ce11 genome in `read_vcf()`.
- Added `read_maf_minimal()` to support minimal MAF-like data as input.
- Fixed the issue that the latest CN signatures from COSMIC had labels inconsistent with the built-in CN signatures (#421).
- Sorted substitution mutation types by default in `sig_tally()`.
- Added a parameter in `sigprofiler_extract()` to help generate the input matrix file for calling SigProfiler directly.
- Added some notes in `sigprofiler_extract()`.
- Added a utility function `sigprofiler_reorder()` for generating the SigProfiler input matrix file with the standard mutation type order.
- Fixed the bug about plotting CN chromosome distribution (#420, thanks to @jrcodina96).
- Updated COSMIC latest version from v3.2 to v3.3. A new reference for copy number signatures is now provided as `latest_CN_GRCh37` (#412).
- `get_sig_similarity()` now uses "SBS" as the default reference.
- Fixed bug in `show_cn_circos()`.
- Added `group_enrichment2()`.
- Fixed checking in `sig_tally()`.
- Added a vignette to introduce the analysis of copy number signatures.
- Updated `CNS_TCGA`.
- Enhanced `group_enrichment()` with reference group support. Example:

      set.seed(1234)
      df <- dplyr::tibble(
        g1 = rep(LETTERS[1:3], c(50, 40, 10)),
        g2 = rep(c("AA", "VV", "XX"), c(50, 40, 10)),
        e1 = sample(c("P", "N"), 100, replace = TRUE),
        e2 = rnorm(100)
      )
      x1 <- group_enrichment(df, grp_vars = c("g1", "g2"),
        enrich_vars = c("e1", "e2"),
        ref_group = c("B", "VV"))
      x1

- Added option for reading ASCAT objects in parallel.
- Fixed error in extracting invalid regions (#396, thanks to @KirsieMin).
- Enhanced `read_copynumber_seqz()` to include minor copy number. (Thanks to yancey)
- Added input `range` check in `sig_estimate()`. (#391)
- Expanded `output_*` functions by adding option `sig_db`.
- Fixed the error when using `sigminer::get_genome_annotation()` before loading the package.
- Fixed the bug that `get_pLOH_score()` returned nothing for samples without LOH.
- Added `sig_unify_extract()` as a unified signature extractor.
- Fixed error showing reference signature profile for `CNS_TCGA` database.
- Implemented `y_limits` option in `show_sig_profile()` (#381).
- Added function `get_pLOH_score()` for representing the genome that displayed LOH.
- Added function `read_copynumber_ascat()` for reading an ASCAT result object in `.rds` format.
- Added function `get_intersect_size()` for getting overlap size between intervals.
- Added option to `get_Aneuploidy_score()` to remove short arms of chr13/14/15/21/22 from calculation.
- Implemented Cohen-Sharir method-like Aneuploidy Score.
- Enhanced error handling in `show_sig_feature_corrplot()` (#376).
- Fixed INDEL classification.
- Fixed end position determination in `read_vcf()`.
- Updated INDEL adjustment.
- Included TCGA copy number signatures from SigProfiler.
- Updated docs.
- Preprocessed INDELs before labeling them in `sig_tally()` (#370).
- Fixed `sigprofiler_extract()` extracting copy number signatures and rolled up the sigprofiler version (#369).
- Fixed `output_sig()` error in handling exposure plot with >9 signatures (#366).
- Added `limitsize = FALSE` for `ggsave()` or `ggsave2()` for handling big figures.
- Supported `mm9` genome build.
- Removed FTP link as CRAN suggested (#359).
- Updated README.
- Fixed the SigProfiler installation error due to Python version in conda environment.
- Fixed classification bug due to repeated function name `call_component`.
- Fixed the bug when using `read_vcf()` with `##` commented VCF files.
- Added support for latest COSMIC v3.2 as reference signatures. You can obtain them by:

      for (i in c("latest_SBS_GRCh37", "latest_DBS_GRCh37", "latest_ID_GRCh37",
                  "latest_SBS_GRCh38", "latest_DBS_GRCh38",
                  "latest_SBS_mm9", "latest_DBS_mm9",
                  "latest_SBS_mm10", "latest_DBS_mm10",
                  "latest_SBS_rn6", "latest_DBS_rn6")) {
        message(i)
        get_sig_db(i)
      }

- Updated `keep_only_pass` to `FALSE` by default.
- Added RSS and unexplained variance calculation in `get_sig_rec_similarity()`.
- Added data check and filter in `output_tally()` and `show_catalogue()`.
- Enhanced `show_group_enrichment()` (#353) & added a new option to cluster rows.
- Removed unnecessary CN classification code in recent development.
- Dropped copy number "M" method to avoid misguiding users to use/read the wrong signature profile and to keep the code simple.
- Modified the default visualization of `bp_show_survey()`.
- Enhanced `torch` check.
- `read_sv_as_rs()` and `sig_tally.RS()` for simplified genome rearrangement classification matrix generation (experimental).
- Fixed the assignment problem about match pairs in `bp_extract_signatures()` with the `lpSolve` package instead of using my problematic code.
- Supported `mm10` in `read_vcf()`.
- Removed large data files and stored them in Zenodo to reduce package size.
- Added cores check.
- Upgraded SP to v1.1.0 (need test).
- Tried installing Torch before SP (need test).
- Fixed bug in silhouette calculation in `bp_extract_signatures()` (#332). PAY ATTENTION: this may affect results.
- Fixed bug using custom signature name in `show_sig_profile_loop()`.
- Subsetting the signatures to plot is available via the `sig_names` option.
- sigminer is available in bioconda channel: https://anaconda.org/bioconda/r-sigminer/
- Updated `ms` strategy in `sig_auto_extract()` by assigning each signature to its best matched reference signatures.
- Added `get_shannon_diversity_index()` to get diversity index for signatures (#333).
- Added new method "S" (from Steele et al. 2019) for tallying copy number data (#329).
- Included new (RS) reference signatures (related to #331).
- Updated the internal code for getting relative activity in `get_sig_exposure()`.
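
For readers unfamiliar with the diversity index mentioned above, here is a minimal base R sketch of the standard Shannon index, H = -sum(p * log(p)), applied to the relative signature exposures of each sample. The helper name `shannon_diversity()` and the toy matrix are hypothetical, and the exact formula or options used by `get_shannon_diversity_index()` may differ.

```r
# Hypothetical helper: Shannon diversity of signature exposures, one value per sample.
# `expo` is assumed to be a non-negative matrix with signatures in rows and samples
# in columns, and every sample is assumed to have a positive total exposure.
shannon_diversity <- function(expo) {
  apply(expo, 2, function(x) {
    p <- x / sum(x) # relative exposure
    p <- p[p > 0]   # treat 0 * log(0) as 0
    -sum(p * log(p))
  })
}

# Toy data: S1 is split evenly between two signatures, S2 is driven by one signature.
expo <- matrix(c(50, 50, 100, 0),
  nrow = 2,
  dimnames = list(c("Sig1", "Sig2"), c("S1", "S2"))
)
h <- shannon_diversity(expo)
h
```

A sample dominated by a single signature gets an index of 0, while an even split over k signatures gives the maximum value log(k).
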
- `bp_get_clustered_sigs()` to get clustered mean signatures.
- Updated author list.
- Added a quick start vignette.
- A new option `highlight` is added to `show_sig_number_survey()` and `bp_show_survey2()` to highlight a selected number.
- A new option `cut_p_value` is added to `show_group_enrichment()` to cut continuous p values into binned regions.
- A Python backend for `sig_extract()` is provided.
- Users can now directly use `sig_extract()` and `sig_auto_extract()` without loading the NMF package first.
- Added benchmark results for different extraction approaches in README.
- The threshold for `auto_reduce` in `sig_fit()` is modified from 0.99 to 0.95 and the similarity update threshold is changed from `>0` to `>=0.01`.
- Removed `pConstant` option from `sig_extract()` and `sig_estimate()`. An auto-check function is now used to avoid the error from the NMF package caused by a component with no contribution in any sample.
- `bp_show_survey2()` to plot a simplified version for signature number survey (#330).
- `read_xena_variants()` to read variant data from UCSC Xena as a `MAF` object for signature analysis.
- `get_sig_rec_similarity()` for getting reconstructed profile similarity for `Signature` object (#293).
- Added functions starting with `bp_`, which are combined to provide a best practice for extracting signatures in cancer research. For more details, run `?bp` in your R console.
- Added data simulation.
- Suppressed `future` warnings.
- Fixed p value calculation in bootstrap analysis.
- Fixed typo in `show_cor()`, thanks to @Miachol.
- Added `y_tr` option in `show_sig_profile()` to transform y axis values.
- Optimized default behavior of `read_copynumber()`.
  - Support LOH records when user input minor allele copy number.
  - Set `complement = FALSE` as default.
  - Free dependencies between option `use_all` and `complement`.
- Added visualization support for genome rearrangement signatures (#300).
- Added four databases for reference signatures from https://doi.org/10.1038/s43018-020-0027-5 (#299).
- Added new measure 'CV' for `show_sig_bootstrap()` (#298).
- Added `group_enrichment()` and `show_group_enrichment()` (#277).
- Optimized signature profile visualization (#295).
- Updated `?sigminer` documentation.
- Added `ms` strategy to select optimal solution by maximizing cosine similarity to reference signatures.
- Added `same_size_clustering()` for same size clustering.
- Added `show_cosmic()` to support reading COSMIC signatures in web browser (#288).
- Changed argument `rel_threshold` behavior in `sig_fit()` and `get_sig_exposure()`. Made them more consistent and allowed un-assigned signature contribution (#285).
- Updated all COSMIC signatures to v3.1 and their aetiologies (#287).
- Added more specific reference signatures from SigProfiler, e.g. `SBS_mm9`.
- Supported `data.frame` as input object for `sig` in `get_sig_similarity()` and `sig_fit()`.
- Modified `g_label` option in `show_group_distribution()` to better control group names.
- Added `test` option and variable checking in `show_cor()`.
- Updated `output_sig()` to output signature exposure distribution (#280).
- Added `show_cor()` for general association analysis.
- Added options in `show_group_distribution()` to control segments.
- Fixed bugs when outputting only 1 signature.
- Fixed label inverse bug in `add_labels()`, thanks to TaoTao for reporting.
- Handled `,` separated indices in `show_cosmic_signatures`.
- Added option `set_order` in `get_sig_similarity()` (#274).
- Output more stats information in `output_sig()`.
- Fixed default y axis title in `show_sig_bootstrap_error()`, now it is "Reconstruction error (L2 norm)".
- Added `auto_reduce` option in `sig_fit*` functions to improve signature fitting.
- Returned cosine similarity for sample profile in `sig_fit()`.
- Set default strategy in `sig_auto_extract()` to 'optimal'.
- Supported searching reference signature index in `get_sig_cancer_type_index()`.
- Output legacy COSMIC similarity for SBS signatures.
- Added new option in `sigprofiler_extract()` to reduce failure when `refit` is enabled.
- Output both relative and absolute signature exposure in `output_sig()`.
- Updated background color in `show_group_distribution()`.
- Modified the default theme for signature profile in COSMIC style.
- Updated the copy number classification method.
- Handled null catalogue.
- Supported ordering the signatures for results from SigProfiler.
- Supported importing refit results from SigProfiler.
- Set `optimize` option in `sig_extract()` and `sig_auto_extract()`.
- Supported signature index separated by `,` in `sig_fit()` and `sig_fit_bootstrap*` functions.
- Added `output_*` functions from sigflow.
- Enhanced DBS search and error handling in `sig_tally()`.
- Added option `highlight_genes` in `show_cn_group_profile()` to show gene labels.
- Added `get_sig_cancer_type_index()` to get reference signature index.
- Added `show_group_distribution()` to show group distribution.
- Added options in `show_cn_profile()` to show specified ranges and add copy number value labels.
- Used package `nnls` instead of `pracma` for NNLS implementation in `sig_fit()`.
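
As background for the NNLS entry above: signature fitting by non-negative least squares looks for exposures `e >= 0` that minimize the squared difference between a sample catalog and `sig %*% e`. The sketch below is not how `sig_fit()` is implemented (the changelog says it relies on the `nnls` package); it swaps in base R's `optim()` with box constraints so that it runs without extra packages. The helper name `nnls_fit()` and the toy signatures are hypothetical.

```r
# Hypothetical helper: non-negative least squares via bounded L-BFGS-B in base R.
# `sig` is a components x signatures matrix, `catalog` a vector of component counts.
nnls_fit <- function(sig, catalog) {
  obj <- function(e) sum((catalog - sig %*% e)^2)                       # residual sum of squares
  grad <- function(e) as.vector(-2 * t(sig) %*% (catalog - sig %*% e))  # its gradient
  fit <- optim(rep(1, ncol(sig)), obj, grad, method = "L-BFGS-B", lower = 0)
  setNames(fit$par, colnames(sig))
}

# Two toy signatures over six mutation components (each column sums to 1).
sig <- cbind(
  SigA = c(0.50, 0.30, 0.10, 0.05, 0.03, 0.02),
  SigB = c(0.02, 0.03, 0.05, 0.10, 0.30, 0.50)
)
# Noise-free catalog built from known exposures of 30 and 70 mutations.
catalog <- as.vector(sig %*% c(30, 70))

expo <- nnls_fit(sig, catalog)
round(expo, 2)
```

On noise-free input like this, the fitted exposures should land very close to the values used to build the catalog; with real data the residual is non-zero and a dedicated NNLS solver is the safer choice.
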
- Supported `BSgenome.Hsapiens.1000genomes.hs37d5` in `sig_tally()`.
- Removed changing `MT` to `M` in mutation data.
- Fixed bug in extracting numeric signature names and signature orderings in `show_sig_exposure()`.
- Added `letter_colors` as an unexported discrete palette.
- Added `transform_seg_table()`.
- Added `show_cn_group_profile()`.
- Added `show_cn_freq_circos()`.
- `sig_orders` option in `show_sig_profile()` function now can select and order signatures to plot.
- Added `show_sig_profile_loop()` for better signature profile visualization.
- Added option to control the SigProfilerExtractor to avoid issue in docker image build.
- Some updates.
- Compatible with SigProfiler 1.0.15
- Tried to speed up joining adjacent segments in `read_copynumber()`, got 200% improvement.
- Tried to speed up joining adjacent segments in `read_copynumber()`, got 20% improvement.
- Added `cosine()` function.
- Added and exported `get_sig_db()` to let users directly load signature database.
- Added `sigprofiler_extract()` and `sigprofiler_import()` to call SigProfiler and import results.
- Added `read_vcf()` for simply reading VCF files.
- Implemented DBS-1248.
- Added `show_sig_profile_heatmap()`.
- Supported mouse genome 'mm10' (#241).
- Added `read_copynumber_seqz()` to read sequenza result directory.
- Sped up the annotation process in `read_copynumber()`.
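
Cosine similarity, which several entries in this changelog rely on (for example when comparing extracted signatures to reference signatures), is the dot product of two profiles divided by the product of their Euclidean norms. Below is a minimal base R sketch; the helper name `cosine_sim()` and the toy vectors are hypothetical, and the signature of the exported `cosine()` function is not assumed here.

```r
# Hypothetical helper: cosine similarity between two numeric vectors of equal length.
cosine_sim <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

a <- c(1, 2, 3)
b <- c(2, 4, 6)   # same direction as `a`, different magnitude
d <- c(3, 0, -1)  # orthogonal to `a`

sim_same <- cosine_sim(a, b)
sim_orth <- cosine_sim(a, d)
c(sim_same, sim_orth)
```

Because the measure ignores magnitude, two profiles with the same shape but different mutation counts score 1, which is why it suits comparing normalized signature profiles.
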
- Fixed bug in OsCN feature calculation.
- Removed useless options in `read_maf()`.
- Modified method 'LS' in `sig_fit()` to 'NNLS' and implemented it with the pracma package (#216).
- Made `use_all` option in `read_copynumber()` work correctly.
- Fixed potential problem raised by unordered copy number segments (#217).
- Fixed a typo, corrected `MRSE` to `RMSE`.
- Added feature in `show_sig_bootstrap_*()` for plotting aggregated values.
- Fixed bug when using `get_groups()` for clustering.
- Fixed bug about using reference components from NatGen 2018 paper.
- Added option `highlight_size` for `show_sig_bootstrap_*()`.
- Fixed bug about signature profile plotting for method 'M'.
- Added "scatter" in `sig_fit()` function to better visualize a few samples.
- Added "highlight" option.
- `lsei` package was removed from CRAN, so I reset the default method to 'QP' and tried my best to keep the LS usage in sigminer (#189).
- Made consistent copy number labels in `show_sig_profile()` and added input checking for this function.
- Fixed inconsistent bootstrap when using `furrr`, solution is from futureverse/furrr#107.
- Properly handled null-count sample in `sig_fit()` for methods `QP` and `SA`.
- Supported boxplot or violin in `show_sig_fit()` and `show_sig_bootstrap_*` functions.
- Added job mode for `sig_fit_bootstrap_batch` to make it more useful in practice.
- Added `show_groups()` to show the signature contribution in each group from `get_groups()`.
- Expanded clustering in `get_groups()` to result of `sig_fit()`.
- Properly handled null-count samples in `sig_fit_bootstrap_batch()`.
- Added strand bias labeling for INDEL.
- Added COSMIC TSB signatures.
- Exported APOBEC result when the mode is 'ALL' in `sig_tally()`.
- Added batch bootstrap analysis feature (#158).
- Supported all common signature plotting.
- Added strand feature to signature profile.
- Added profile plot for DBS and INDEL.
- Fixed error for signature extraction in mode 'DBS' or 'ID'.
- Fixed the issue that method 'M' for CN tally could not work when `cores > 1` (#161).
- Added multiple methods for `sig_fit()`.
- Added feature `sig_fit_bootstrap()` for bootstrap results.
- Added multiple classification methods for SBS signature.
- Added strand bias enrichment analysis for SBS signature.
- Moved multiple packages from field `Imports` to `Suggests`.
- Added feature `report_bootstrap_p_value()` to report p values.
- Added common DBS and ID signatures.
- Updated citation.
- Added merged transcript info for hg19 and hg38 build, this is available by `data()`.
- Added gene info for hg19 and hg38 build to extdata directory.
- Removed `fuzzyjoin` package from dependency.
- Moved `ggalluvial` package to field `Suggests`.
To all users: this is a breakthrough version of sigminer. Most functions have been modified and more features have been implemented. Please read the reference list to see the function groups and their functionalities.
Please read the vignette for usage.
I hope it helps your research work and makes a new contribution to the scientific community.