ASAP - Automated Single-cell Analysis Portal

Release 7: 2023-11-08

Updated Docker image. New single-cell-specific pipeline using Seurat.

Tool/DB	Version	Details	Tool type
ASAP_data PostgreSQL database	v6		Database
Gene Ontology	2020-Jan		Database
Ensembl vertebrates	104	MySQL tables are used to populate gene sets for Reactome, Drugbank and GO.	Database
Ensembl genomes	58	MySQL tables are used to populate gene sets for Reactome, Drugbank and GO.	Database
Human Cell Atlas Ontology	NA	Human Cell Atlas Ontology used to create gene sets.	Database
Flybase Anatomy Ontology	NA	Flybase Anatomy Ontology used to create gene sets.	Database
fabdavid/asap_run	v6	Python R Java	Docker

Release 6: 2022-04-12

Ensembl update.

Tool/DB	Version	Details	Tool type
ASAP_data PostgreSQL database	v6		Database
Gene Ontology	2020-Jan		Database
Ensembl vertebrates	104	MySQL tables are used to populate gene sets for Reactome, Drugbank and GO.	Database
Ensembl genomes	58	MySQL tables are used to populate gene sets for Reactome, Drugbank and GO.	Database
Human Cell Atlas Ontology	2020-06-01	Human Cell Atlas Ontology used to create gene sets.	Database
Flybase Anatomy Ontology	2021-12-09	Flybase Anatomy Ontology used to create gene sets.	Database
fabdavid/asap_run	v5	Python R Java	Docker

Release 5: 2020-02-12

Optimized the asap_run Docker image and migrated the Python scripts from v2.7 to v3.0.
Updated Ensembl data in the asap_data database.

Tool/DB	Version	Details	Tool type
ASAP_data PostgreSQL database	v5		Database
Gene Ontology	2020-Jan		Database
Ensembl vertebrates	101	MySQL tables are used to populate gene sets for Reactome, Drugbank and GO.	Database
Ensembl genomes	58	MySQL tables are used to populate gene sets for Reactome, Drugbank and GO.	Database
fabdavid/asap_run	v5	Python R Java	Docker

Release 4: 2019-02-01

New version with docker, LOOM, HCA binding and more.

Tool/DB	Version	Details	Tool type
ASAP_data PostgreSQL database	v4		Database
Gene Ontology	2017-Jun		Database
KEGG	2016-Nov		Database
MSigDB from GSEA	2016-Nov		Database
Gene Atlas	2016-Nov		Database
Ensembl vertebrates	97	MySQL tables are used to populate gene sets for Reactome, Drugbank and GO.	Database
Ensembl genomes	58	MySQL tables are used to populate gene sets for Reactome, Drugbank and GO.	Database
fabdavid/asap_run	v4	Python R Java	Docker

Release 3: 2018-04-12

New version including:
- New trajectory visualization (Monocle)
- Project sharing
- Searchable list of projects / analyses

Tool/DB	Version	Details	Tool type
JSC [Java statistical API]	1.0		Java
scLVM	0.99.3	Buettner F, Natarajan KN, Casale FP, Proserpio V, Scialdone A, Theis FJ, Teichmann SA, Marioni JC & Stegle O, 2015. Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells, Nature Biotechnology, doi: 10.1038/nbt.3102.	R
GPy	1.5.6	GPy is a Gaussian Process (GP) framework written in python, from the Sheffield machine learning group.	Python
limix	0.8.0.dev0	Limix is a flexible and efficient linear mixed model library with interfaces to Python.	Python
h5py	2.6.0	The h5py package is a Pythonic interface to the HDF5 binary data format.	Python
NA	0.18.1	NA	NA
numpy	1.11.2		Python
scipy	0.18.1		Python
Java JDK	1.8.0_111		Java
Python	2.7.5		Python
ComBat [sva package]	3.24.4	The sva package contains functions for removing batch effects and other unwanted variation in high-throughput experiments. Specifically, the sva package contains functions for identifying and building surrogate variables for high-dimensional data sets. Surrogate variables are covariates constructed directly from high-dimensional data (like gene expression/RNA sequencing/methylation/brain imaging data) that can be used in subsequent analyses to adjust for unknown, unmodeled, or latent sources of noise. The sva package can be used to remove artifacts in three ways: (1) identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments (Leek and Storey 2007 PLoS Genetics, 2008 PNAS), (2) directly removing known batch effects using ComBat (Johnson et al. 2007 Biostatistics) and (3) removing batch effects with known control probes (Leek 2014 bioRxiv). Removing batch effects and using surrogate variables in differential expression analysis have been shown to reduce dependence, stabilize error rate estimates, and improve reproducibility (see Leek and Storey 2007 PLoS Genetics, 2008 PNAS or Leek et al. 2011 Nat. Reviews Genetics). Johnson, WE, Rabinovic, A, and Li, C (2007). Adjusting batch effects in microarray expression data using Empirical Bayes methods. Biostatistics 8(1):118-127.	R
R version & default packages [stats] (K-means, PCA, Hierarchical clustering, distance, correlation)	3.3.1		R
SC3	1.3.11	A tool for unsupervised clustering and analysis of single cell RNA-Seq data. Kiselev VY, Kirschner K, Schaub MT, Andrews T, Yiu A, Chandra T, Natarajan KN, Reik W, Barahona M, Green AR and Hemberg M (2016). “SC3 - consensus clustering of single-cell RNA-Seq data.” bioRxiv. doi: 10.1101/036558, http://biorxiv.org/content/early/2016/09/02/036558.	R
MDS [MASS Package]	7.3-45	Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002). Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0	R
PAM / Silhouette Plot [cluster package]	2.0.6	Methods for Cluster analysis. Much extended the original from Peter Rousseeuw, Anja Struyf and Mia Hubert, based on Kaufman and Rousseeuw (1990) "Finding Groups in Data". Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K.(2016). cluster: Cluster Analysis Basics and Extensions.	R
ZIFA	0.1	Emma Pierson and Christopher Yau, ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis, Genome Biology, 16:241, 2015, DOI: 10.1186/s13059-015-0805-z	Python
Rtsne	0.13	An R wrapper around the fast T-distributed Stochastic Neighbor Embedding implementation by Van der Maaten (see <https://github.com/lvdmaaten/bhtsne/> for more information on the original implementation). L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008.	R
Pagoda / SCDE	2.5.0	The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734). Peter V Kharchenko, Lev Silberstein and David T Scadden, Bayesian approach to single-cell differential expression analysis, Nature Methods 11, 740–742 (2014) doi:10.1038/nmeth.2967	R
Limma / Voom	3.32.2	Data analysis, linear models and differential expression for microarray data. Charity W Law, Yunshun Chen, Wei Shi and Gordon K Smyth, voom: precision weights unlock linear model analysis tools for RNA-seq read counts, Genome Biology, 15:R29, 2014 DOI: 10.1186/gb-2014-15-2-r29	R
edgeR	3.18.1	Differential expression analysis of RNA-seq expression profiles with biological replication. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. As well as RNA-seq, it can be applied to differential signal analysis of other types of genomic data that produce read counts, including ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE and CAGE. Mark D. Robinson, Davis J. McCarthy, and Gordon K. Smyth, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data, Bioinformatics. 2010 Jan 1; 26(1): 139–140. doi: 10.1093/bioinformatics/btp616. PMCID: PMC2796818	R
DESeq2	1.16.1	Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution. Love MI, Huber W and Anders S (2014). “Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.” Genome Biology, 15, pp. 550. doi: 10.1186/s13059-014-0550-8.	R
SCAN / UPC	2.18.0	SCAN is a microarray normalization method to facilitate personalized-medicine workflows. Rather than processing microarray samples as groups, which can introduce biases and present logistical challenges, SCAN normalizes each sample individually by modeling and removing probe- and array-specific background noise using only data from within each array. SCAN can be applied to one-channel (e.g., Affymetrix) or two-channel (e.g., Agilent) microarrays. The Universal exPression Codes (UPC) method is an extension of SCAN that estimates whether a given gene/transcript is active above background levels in a given sample. The UPC method can be applied to one-channel or two-channel microarrays as well as to RNA-Seq read counts. Because UPC values are represented on the same scale and have an identical interpretation for each platform, they can be used for cross-platform data integration. Piccolo SR, Withers MR, Francis OE, Bild AH and Johnson WE (2013). “Multi-platform single-sample estimates of transcriptional activation.” Proceedings of the National Academy of Sciences of the United States of America, 110(44), pp. 17778-17783. doi: 10.1016/j.ygeno.2012.08.003.	R
data.table	1.10.4	Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development. https://github.com/Rdatatable/data.table/wiki	R
Gene Ontology	2017-Jun		Database
KEGG	2016-Nov		Database
MSigDB from GSEA	2016-Nov		Database
Gene Atlas	2016-Nov		Database
Ensembl	2017-Mar	Ensembl Database (GTF files)	Database

Release 2: 2017-07-09

New version including:
- changing reading and writing methods to the 'data.table' package to speed up processing of big datasets
- handling of 69 organisms (from all releases of Ensembl database)
- addition of alternative gene names from old Ensembl releases
- correction of minor bugs
- update of R and Python libraries

Tool/DB	Version	Details	Tool type
JSC [Java statistical API]	1.0		Java
scLVM	0.99.3	Buettner F, Natarajan KN, Casale FP, Proserpio V, Scialdone A, Theis FJ, Teichmann SA, Marioni JC & Stegle O, 2015. Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells, Nature Biotechnology, doi: 10.1038/nbt.3102.	R
GPy	1.5.6	GPy is a Gaussian Process (GP) framework written in python, from the Sheffield machine learning group.	Python
limix	0.8.0.dev0	Limix is a flexible and efficient linear mixed model library with interfaces to Python.	Python
h5py	2.6.0	The h5py package is a Pythonic interface to the HDF5 binary data format.	Python
NA	0.18.1	NA	NA
numpy	1.11.2		Python
scipy	0.18.1		Python
Java JDK	1.8.0_111		Java
Python	2.7.5		Python
ComBat [sva package]	3.24.4	The sva package contains functions for removing batch effects and other unwanted variation in high-throughput experiments. Specifically, the sva package contains functions for identifying and building surrogate variables for high-dimensional data sets. Surrogate variables are covariates constructed directly from high-dimensional data (like gene expression/RNA sequencing/methylation/brain imaging data) that can be used in subsequent analyses to adjust for unknown, unmodeled, or latent sources of noise. The sva package can be used to remove artifacts in three ways: (1) identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments (Leek and Storey 2007 PLoS Genetics, 2008 PNAS), (2) directly removing known batch effects using ComBat (Johnson et al. 2007 Biostatistics) and (3) removing batch effects with known control probes (Leek 2014 bioRxiv). Removing batch effects and using surrogate variables in differential expression analysis have been shown to reduce dependence, stabilize error rate estimates, and improve reproducibility (see Leek and Storey 2007 PLoS Genetics, 2008 PNAS or Leek et al. 2011 Nat. Reviews Genetics). Johnson, WE, Rabinovic, A, and Li, C (2007). Adjusting batch effects in microarray expression data using Empirical Bayes methods. Biostatistics 8(1):118-127.	R
R version & default packages [stats] (K-means, PCA, Hierarchical clustering, distance, correlation)	3.3.1		R
SC3	1.3.11	A tool for unsupervised clustering and analysis of single cell RNA-Seq data. Kiselev VY, Kirschner K, Schaub MT, Andrews T, Yiu A, Chandra T, Natarajan KN, Reik W, Barahona M, Green AR and Hemberg M (2016). “SC3 - consensus clustering of single-cell RNA-Seq data.” bioRxiv. doi: 10.1101/036558, http://biorxiv.org/content/early/2016/09/02/036558.	R
MDS [MASS Package]	7.3-45	Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002). Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0	R
PAM / Silhouette Plot [cluster package]	2.0.6	Methods for Cluster analysis. Much extended the original from Peter Rousseeuw, Anja Struyf and Mia Hubert, based on Kaufman and Rousseeuw (1990) "Finding Groups in Data". Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K.(2016). cluster: Cluster Analysis Basics and Extensions.	R
ZIFA	0.1	Emma Pierson and Christopher Yau, ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis, Genome Biology, 16:241, 2015, DOI: 10.1186/s13059-015-0805-z	Python
Rtsne	0.13	An R wrapper around the fast T-distributed Stochastic Neighbor Embedding implementation by Van der Maaten (see <https://github.com/lvdmaaten/bhtsne/> for more information on the original implementation). L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008.	R
Pagoda / SCDE	2.5.0	The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734). Peter V Kharchenko, Lev Silberstein and David T Scadden, Bayesian approach to single-cell differential expression analysis, Nature Methods 11, 740–742 (2014) doi:10.1038/nmeth.2967	R
Limma / Voom	3.32.2	Data analysis, linear models and differential expression for microarray data. Charity W Law, Yunshun Chen, Wei Shi and Gordon K Smyth, voom: precision weights unlock linear model analysis tools for RNA-seq read counts, Genome Biology, 15:R29, 2014 DOI: 10.1186/gb-2014-15-2-r29	R
edgeR	3.18.1	Differential expression analysis of RNA-seq expression profiles with biological replication. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. As well as RNA-seq, it can be applied to differential signal analysis of other types of genomic data that produce read counts, including ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE and CAGE. Mark D. Robinson, Davis J. McCarthy, and Gordon K. Smyth, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data, Bioinformatics. 2010 Jan 1; 26(1): 139–140. doi: 10.1093/bioinformatics/btp616. PMCID: PMC2796818	R
DESeq2	1.16.1	Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution. Love MI, Huber W and Anders S (2014). “Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.” Genome Biology, 15, pp. 550. doi: 10.1186/s13059-014-0550-8.	R
SCAN / UPC	2.18.0	SCAN is a microarray normalization method to facilitate personalized-medicine workflows. Rather than processing microarray samples as groups, which can introduce biases and present logistical challenges, SCAN normalizes each sample individually by modeling and removing probe- and array-specific background noise using only data from within each array. SCAN can be applied to one-channel (e.g., Affymetrix) or two-channel (e.g., Agilent) microarrays. The Universal exPression Codes (UPC) method is an extension of SCAN that estimates whether a given gene/transcript is active above background levels in a given sample. The UPC method can be applied to one-channel or two-channel microarrays as well as to RNA-Seq read counts. Because UPC values are represented on the same scale and have an identical interpretation for each platform, they can be used for cross-platform data integration. Piccolo SR, Withers MR, Francis OE, Bild AH and Johnson WE (2013). “Multi-platform single-sample estimates of transcriptional activation.” Proceedings of the National Academy of Sciences of the United States of America, 110(44), pp. 17778-17783. doi: 10.1016/j.ygeno.2012.08.003.	R
data.table	1.10.4	Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development. https://github.com/Rdatatable/data.table/wiki	R
Gene Ontology	2017-Jun		Database
KEGG	2016-Nov		Database
MSigDB from GSEA	2016-Nov		Database
Gene Atlas	2016-Nov		Database
Ensembl	2017-Mar	Ensembl Database (GTF files)	Database

Release 1: 2016-11-14

This is the first version!

Tool/DB	Version	Details	Tool type
JSC [Java statistical API]	1.0		Java
scLVM	0.99.2	Buettner F, Natarajan KN, Casale FP, Proserpio V, Scialdone A, Theis FJ, Teichmann SA, Marioni JC & Stegle O, 2015. Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells, Nature Biotechnology, doi: 10.1038/nbt.3102.	R
GPy	1.5.6	GPy is a Gaussian Process (GP) framework written in python, from the Sheffield machine learning group.	Python
limix	0.8.0.dev0	Limix is a flexible and efficient linear mixed model library with interfaces to Python.	Python
h5py	2.6.0	The h5py package is a Pythonic interface to the HDF5 binary data format.	Python
NA	0.18.1	NA	NA
numpy	1.11.2		Python
scipy	0.18.1		Python
Java JDK	1.8.0_111		Java
Python	2.7.5		Python
ComBat [sva package]	3.18.0	The sva package contains functions for removing batch effects and other unwanted variation in high-throughput experiments. Specifically, the sva package contains functions for identifying and building surrogate variables for high-dimensional data sets. Surrogate variables are covariates constructed directly from high-dimensional data (like gene expression/RNA sequencing/methylation/brain imaging data) that can be used in subsequent analyses to adjust for unknown, unmodeled, or latent sources of noise. The sva package can be used to remove artifacts in three ways: (1) identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments (Leek and Storey 2007 PLoS Genetics, 2008 PNAS), (2) directly removing known batch effects using ComBat (Johnson et al. 2007 Biostatistics) and (3) removing batch effects with known control probes (Leek 2014 bioRxiv). Removing batch effects and using surrogate variables in differential expression analysis have been shown to reduce dependence, stabilize error rate estimates, and improve reproducibility (see Leek and Storey 2007 PLoS Genetics, 2008 PNAS or Leek et al. 2011 Nat. Reviews Genetics). Johnson, WE, Rabinovic, A, and Li, C (2007). Adjusting batch effects in microarray expression data using Empirical Bayes methods. Biostatistics 8(1):118-127.	R
R version & default packages [stats] (K-means, PCA, Hierarchical clustering, distance, correlation)	3.3.1		R
SC3	1.3.11	A tool for unsupervised clustering and analysis of single cell RNA-Seq data. Kiselev VY, Kirschner K, Schaub MT, Andrews T, Yiu A, Chandra T, Natarajan KN, Reik W, Barahona M, Green AR and Hemberg M (2016). “SC3 - consensus clustering of single-cell RNA-Seq data.” bioRxiv. doi: 10.1101/036558, http://biorxiv.org/content/early/2016/09/02/036558.	R
MDS [MASS Package]	7.3-45	Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002). Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0	R
PAM / Silhouette Plot [cluster package]	2.0.5	Methods for Cluster analysis. Much extended the original from Peter Rousseeuw, Anja Struyf and Mia Hubert, based on Kaufman and Rousseeuw (1990) "Finding Groups in Data". Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K.(2016). cluster: Cluster Analysis Basics and Extensions.	R
ZIFA	0.1	Emma Pierson and Christopher Yau, ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis, Genome Biology, 16:241, 2015, DOI: 10.1186/s13059-015-0805-z	Python
Rtsne	0.11	An R wrapper around the fast T-distributed Stochastic Neighbor Embedding implementation by Van der Maaten (see <https://github.com/lvdmaaten/bhtsne/> for more information on the original implementation). L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008.	R
Pagoda / SCDE	1.99.4	The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734). Peter V Kharchenko, Lev Silberstein and David T Scadden, Bayesian approach to single-cell differential expression analysis, Nature Methods 11, 740–742 (2014) doi:10.1038/nmeth.2967	R
Limma / Voom	3.26.9	Data analysis, linear models and differential expression for microarray data. Charity W Law, Yunshun Chen, Wei Shi and Gordon K Smyth, voom: precision weights unlock linear model analysis tools for RNA-seq read counts, Genome Biology, 15:R29, 2014 DOI: 10.1186/gb-2014-15-2-r29	R
edgeR	3.12.1	Differential expression analysis of RNA-seq expression profiles with biological replication. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. As well as RNA-seq, it can be applied to differential signal analysis of other types of genomic data that produce read counts, including ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE and CAGE. Mark D. Robinson, Davis J. McCarthy, and Gordon K. Smyth, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data, Bioinformatics. 2010 Jan 1; 26(1): 139–140. doi: 10.1093/bioinformatics/btp616. PMCID: PMC2796818	R
DESeq2	1.10.1	Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution. Love MI, Huber W and Anders S (2014). “Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.” Genome Biology, 15, pp. 550. doi: 10.1186/s13059-014-0550-8.	R
SCAN / UPC	2.12.1	SCAN is a microarray normalization method to facilitate personalized-medicine workflows. Rather than processing microarray samples as groups, which can introduce biases and present logistical challenges, SCAN normalizes each sample individually by modeling and removing probe- and array-specific background noise using only data from within each array. SCAN can be applied to one-channel (e.g., Affymetrix) or two-channel (e.g., Agilent) microarrays. The Universal exPression Codes (UPC) method is an extension of SCAN that estimates whether a given gene/transcript is active above background levels in a given sample. The UPC method can be applied to one-channel or two-channel microarrays as well as to RNA-Seq read counts. Because UPC values are represented on the same scale and have an identical interpretation for each platform, they can be used for cross-platform data integration. Piccolo SR, Withers MR, Francis OE, Bild AH and Johnson WE (2013). “Multi-platform single-sample estimates of transcriptional activation.” Proceedings of the National Academy of Sciences of the United States of America, 110(44), pp. 17778-17783. doi: 10.1016/j.ygeno.2012.08.003.	R
Gene Ontology	2016-Nov		Database
KEGG	2016-Nov		Database
MSigDB from GSEA	2016-Nov		Database
Gene Atlas	2016-Nov		Database