GEDA
References for Microarray Analysis
===========================================================================
I. Data Format/Data Storage
- Brazma A, et al.: One Stop
Shopping for Microarrays: Is a universal, public DNA microarray database a
realistic goal? Nature 2000, 403:699-700.[Full
text]
- Spellman P, et al.: Design
and implementation of microarray gene expression markup language
(MAGE-ML). Genome Biology, 23 August 2002. [Full Text]
I. Experimental Design
- Black M.A. and RW Doerge.
2002. Calculation of the minimum number of replicate spots required for
detection of significant gene expression fold change in microarray
experiments. Bioinformatics 18(12):1609-1616.[PubMed]
- Churchill GA. 2002.
Fundamentals of experimental design for cDNA microarrays. Nat Genet 32
Suppl 2,490-5.[Pubmed]
- Dash A, Maine IP, Varambally
S, Shen R, Chinnaiyan AM, Rubin MA. 2002. Changes in differential gene
expression because of warm ischemia time of radical prostatectomy specimens.
Am J Pathol 161,1743-8.[Pubmed]
- Dobbin K, Simon R. 2002.
Comparison of microarray designs for class comparison and class discovery.
Bioinformatics 18(11):1438-45.[Pubmed]
- Emptage MR, Hudson-Curtis B,
Sen K. 2003. Treatment of microarray experiments as split-plot designs. J
Biopharm Stat. 2003 May;13(2):159-78.[Pubmed]
- Lee M-LT, Whitmore GA,
Yukhananov RY. Analysis of unbalanced microarray data. Journal of Data
Science 2003, 1:103-121. [JDS Full Text]
- Herwig R, Aanstad P, Clark
M, Lehrach H. 2001 Statistical evaluation of differential expression on
cDNA nylon arrays with replicated experiments. Nucleic Acids Res 29,E117[Pubmed]
- Huang J, Qi R, Quackenbush
J, Dauway E, Lazaridis E, Yeatman T. (2001) Effects of ischemia on gene
expression. J Surg Res 99,222-7.[Pubmed]
- Kendziorski, C.M., M.A.
Newton, H. Lan, and M.N. Gould. 2003. On parametric empirical Bayes
methods for comparing multiple groups using replicated gene expression
profiles. Statistics in Medicine, to appear.[Pubmed]
- Kerr MK. Experimental design
to make the most of microarray studies. Methods Mol Biol.
2003;224:137-47.[Pubmed]
- Kerr MK, Churchill GA. 2001.
Statistical design and the analysis of gene expression microarray data.
Genet Res. 77,123-8.[Pubmed]
- Kerr MK, Churchill GA: Experimental
design for gene expression microarrays. Biostatistics 2:183-201.[Pubmed]
- Lee ML, Kuo FC, Whitmore GA,
Sklar J. 2000. Importance of replication in microarray gene expression
studies, statistical methods and evidence from repetitive cDNA
hybridizations. Proc Natl Acad Sci U S A 97,9834-9.[Pubmed]
- Lee ML, Whitmore GA. 2002.
Power and sample size for DNA microarray studies. Stat Med 21,3543-70.[Pubmed]
- Li C, Hung Wong W. 2001.
Model-based analysis of oligonucleotide arrays, model validation, design
issues and standard error application. PNAS 98,31-36.[Pubmed]
- Liang M, Briggs AG, Rute E,
Greene AS, Cowley Jr AW. 2003. Quantitative assessment of the importance of
dye switching and biological replication in cDNA microarray studies.
Physiol Genomics. 2003 Jun 10 [Epub ahead of print].[Pubmed]
- Lönnstedt, I., and T. Speed.
2002. Replicated microarray data. Statistica Sinica 12(1):
31-46.[Pubmed]
- Pan W, Lin J, Le CT. How
many replicates of arrays are required to detect gene expression changes
in microarray experiments? A mixture model approach. Genome Biol
2002;3(5):research0022[Pubmed]
- Peng X, Wood CL, Blalock EM,
Chen KC, Landfield PW, Stromberg AJ. 2003. Statistical implications of
pooling RNA samples for microarray experiments. BMC Bioinformatics 4(1):26
[Pubmed]
- Wang J, Nygaard V,
Smith-Sorensen B, Hovig E, Myklebost O. (2002) MArray: analysing single,
replicated or reversed microarray experiments. Bioinformatics 18,1139-40.[Pubmed]
- Simon RM, Dobbin K. Experimental
design of DNA microarray experiments. Biotechniques. 2003
Mar;Suppl:16-21.[Pubmed]
- Simon R, Radmacher MD,
Dobbin K, McShane LM. 2003. Pitfalls in the use of DNA microarray data for
diagnostic and prognostic classification. J Natl Cancer Inst. 2003 Jan
1;95(1):14-8.[Pubmed]
- Yang YH, Speed T. 2002.
Design issues for cDNA microarray experiments. Nat Rev Genet 3,579-88.[Pubmed]
- Wang Y, Wang X, Guo SW,
Ghosh S. 2002. Conditions to ensure competitive hybridization in two-color
microarray, a theoretical and experimental analysis. Biotechniques
32,1342-6.[Pubmed]
- Wrobel G, Schlingemann J,
Hummerich L, Kramer H, Lichter P, Hahn M. 2003. Optimization of
high-density cDNA-microarray protocols by 'design of experiments'. Nucleic
Acids Res.31(12):e67. [Pubmed]
II. Steps for Data
Handling/Normalization/Transformation
- Bilban M, Buehler LK, Head S,
Desoye G, Quaranta V. 2002a. Normalizing DNA microarray data. Curr Issues
Mol Biol 4,57-64.[Pubmed]
- Bolstad BM, Irizarry RA,
Astrand M, Speed TP. 2003. A comparison of normalization methods for high
density oligonucleotide array data based on variance and bias.
Bioinformatics 19:185-93.[Pubmed]
- Cheadle C, Vawter MP, Freed
WJ, Becker KG. 2003. Analysis of microarray data using z score
transformation. J Mol Diagn. May;5(2):73-81.[Pubmed]
- Chen YJ, Kodell R, Sistare F,
Thompson KL, Morris S, Chen JJ. Normalization methods for analysis of
microarray gene-expression data. J Biopharm Stat. 2003 Feb;13(1):57-74.[Pubmed]
- Colantuoni C, Henry G, Zeger
S, Pevsner J. 2002a. Local mean normalization of microarray element signal
intensities across an array surface, quality control and correction of
spatially systematic artifacts. Biotechniques 32,1316-20.[Pubmed]
- Colantuoni C, Henry G, Zeger
S, Pevsner J. 2002b. SNOMAD (Standardization and NOrmalization of
MicroArray Data), web-accessible gene expression data analysis.
Bioinformatics 18,1540-1541.[Pubmed]
- Durbin BP, Hardin JS, Hawkins
DM, Rocke DM. (2002) A variance-stabilizing transformation for
gene-expression microarray data. Bioinformatics 18 Suppl 1,S105-10.[Pubmed]
- Edwards D. Non-linear
normalization and background correction in one-channel cDNA microarray
studies. Bioinformatics. 2003 May 1;19(7):825-33.[Pubmed]
- Hill AA, Brown EL, Whitley
MZ, Tucker-Kellogg G, Hunter CP, Slonim DK. 2001. Evaluation of
normalization procedures for oligonucleotide array data based on spiked
cRNA controls. Genome Biol 2,RESEARCH0055 [Pubmed]
- Hoffmann R, Seidl T, Dugas M.
2002. Profound effect of normalization on detection of differentially
expressed genes in oligonucleotide microarray data analysis. Genome Biol
3,RESEARCH0033[Pubmed]
- Hsiao, LL, RV Jensen, T Yoshida, KE Clark, JE Blumenstock SR Gullans 2002. Correcting for signal saturation errors in the analysis of microarray data. Biotechniques 32(2): 330-2, 4, 6.[Pubmed]
- Huber W, Von Heydebreck A,
Sultmann H, Poustka A, Vingron M. (2002) Variance stabilization applied to
microarray data calibration and to the quantification of differential
expression. Bioinformatics 18 Suppl 1,S96-S104.[Pubmed]
- Irizarry, RA, B Hobbs, F Collin, YD
Beazer-Barclay, KJ Antonellis, U Scherf TP Speed 2003. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4(2): 249-64.[Pubmed]
- Kepler TB, Crosby L, Morgan
KT. (2002) Normalization and analysis of DNA microarray data by
self-consistency and local regression. Genome Biol 3,research0037.1-0037.12.[Pubmed]
- Kim JH, Shin DM, Lee YS.
(2002) Effect of local background intensities in the normalization of cDNA
microarray data with a skewed expression profiles. Exp Mol Med 34,224-32.[Pubmed]
- Kroll TC, Wolfl S. 2002.
Ranking: a closer look on globalisation methods for normalisation of gene
expression arrays. Nucleic Acids Res. 2002 Jun 1;30(11):e50.[Pubmed]
- Park T, Yi SG, Kang SH, Lee
S, Lee YS, Simon R 2003. Evaluation of Normalization Methods for
Microarray Data. BMC Bioinformatics. 2003 Sep 2 [Epub ahead of print].
[Pubmed]
- Qian, J., Y. Kluger, H. Yu and
M. Gerstein. 2003. Identification and correction of spurious correlations
in microarray data. Biotechniques 35:42-48.
- Quackenbush J. (2002)
Microarray data normalization and transformation. Nat Genet 32
Suppl,496-501.[Pubmed]
- Rocke DM, Durbin B.
2003. Approximate variance-stabilizing transformations for gene-expression
microarray data. Bioinformatics. May 22;19(8):966-72. [Pubmed]
- Rudi K, Treimo J, Moen B, Rud
I, Vegarud G. 2002. Internal controls for normalizing DNA arrays.
Biotechniques. 2002 Sep;33(3):496, 498, 500 passim [Pubmed].
- Schageman, JJ, M. Basit, TD
Gallardo, HR Garner and RV Shohet. 2002. MarcC-V, a spreadsheet-based tool
for analysis, normalization, and visualization of single cDNA microarray
experiments. Biotechniques 32,338-340, 342, 344.[Pubmed]
- Schadt EE, Li C, Ellis B,
Wong WH. Feature extraction and normalization algorithms for high-density
oligonucleotide gene expression array data. J Cell Biochem Suppl.
2001;Suppl 37:120-5.[Pubmed]
- Schuchhardt J, Beule D, Malik
A, Wolski E, Eickhoff H, Lehrach H, Herzel H. Normalization strategies
for cDNA microarrays. Nucleic Acids Research 2000; 28(10):e47. [Pubmed]
- Shmulevich I, Zhang W. 2002.
Binary analysis and optimization-based normalization of gene expression
data. Bioinformatics. 2002 Apr;18(4):555-65. [Pubmed]
- Tseng GC, Oh MK, Rohlin L,
Liao JC, Wong WH. (2001) Issues in cDNA microarray analysis: quality
filtering, channel normalization, models of variations and assessment of
gene effects. Nucleic Acids Res. 29,2549-57.[Pubmed]
- Vandesompele J, De Preter K,
Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F. (2002) Accurate
normalization of real-time quantitative RT-PCR data by geometric averaging
of multiple internal control genes. Genome Biol. 18;3(7),37[Pubmed]
- Wang Y, Lu J, Lee R, Gu Z,
Clarke R. 2002. Iterative normalization of cDNA microarray data. IEEE
Trans Inf Technol Biomed. Mar;6(1):29-37. [Pubmed]
- Workman C, Jensen LJ, Jarmer
H, Berka R, Gautier L, Nielsen HB, Saxild HH, Nielsen C, Brunak S, Knudsen
S. 2002. A new non-linear normalization method for reducing variability in
DNA microarray experiments. Genome Biol 3,research0048[Pubmed]
- Yang YH, Dudoit S, Luu P, Lin
DM, Peng V, Ngai J, Speed TP. (2002) Normalization for cDNA microarray
data: a robust composite method addressing single and multiple slide
systematic variation. Nucleic Acids Res 15;30(4),e15.[Pubmed]
- Yeung, K. Y., C. Fraley, A.
Murua, A. E. Raftery, and W. L. Ruzzo. Model-based clustering and data
transformations for gene expression data. Bioinformatics 17:977--987,
2001.[Pubmed]
- Cope, LM, RA Irizarry, HA
Jaffee, Z Wu TP Speed 2003. A Benchmark for Affymetrix GeneChip Expression Measures. Bioinformatics 1(1): 1-10.[Pubmed]
- Eisen, MB, PT Spellman, PO Brown D Botstein 1998. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 95(25): 14863-8.[Pubmed]
III. Tests for Differentially Expressed
Genes
1. General
Overview/Comparisons
- Baggerly KA, Coombes
KR, Hess KR, Stivers DN, Abruzzo LV, Zhang W. 2001. Identifying differentially
expressed genes in cDNA microarray experiments. J Comput Biol 8,639-59.[Pubmed]
- Broberg P. Ranking
genes with respect to differential expression. Genome Biol 2002 Aug
5;3(9):preprint0007[Pubmed]
- Kooperberg C, Sipione
S, LeBlanc M, Strand AD, Cattaneo E, Olson JM. 2002. Evaluating test
statistics to select interesting genes in microarray experiments. Hum Mol
Genet 11,2223-32.[Pubmed]
- Storey JD, Tibshirani
R. 2003. Statistical methods for identifying differentially expressed
genes in DNA microarrays. Methods Mol Biol. 224:149-57. [Pubmed]
- Zhang, M.Q. 1999.
Large-scale gene expression data analysis: a new challenge to
computational biologists. Genome Res. 9: 681-688[Pubmed]
2. Least Squares
Methods
- Bushel PR, Hamadeh HK,
Bennett L, Green J, Ableson A, Misener S, Afshari CA, Paules RS.
2002. Computational selection of distinct class- and subclass-specific gene
expression signatures. J Biomed Inform. 2002
Jun;35(3):160-70. [Pubmed]
- Cui X, Churchill GA.
2003. Statistical tests for differential expression in cDNA microarray
experiments. Genome Biol. 2003;4(4):210. Epub 2003 Mar 17. [Pubmed]
- Draghici S, Kulaeva O,
Hoff B, Petrov A, Shams S, Tainsky MA. Noise sampling method: an ANOVA
approach allowing robust selection of differentially regulated genes
measured by DNA microarrays. Bioinformatics. 2003 Jul 22;19(11):1348-59.
[Pubmed]
- Welford SM, Gregg J,
Chen E, Garrison D, Sorensen PH, Denny CT, Nelson SF. 1998. Detection of
differentially expressed genes in primary tumor tissues using
representational differences analysis coupled to microarray
hybridization. Nucleic Acids Res 26, 3059-65.[Pubmed]
- Thomas JG, Olson JM, Tapscott
SJ, Zhao LP. (2001) An
efficient and robust statistical modeling approach to discover differentially
expressed genes using genomic expression profiles. Genome Res 11,1227-36.[Pubmed]
- Yang IV, Chen E,
Hasseman JP, Liang W, Frank BC, Wang S, Sharov V, Saeed AI, White J, Li
J, Lee NH, Yeatman TJ, Quackenbush J. (2002) Within the fold: assessing differential
expression measures and reproducibility in microarray assays. Genome
Biol. 24;3.62.[Pubmed]
3. Nonparametric
Methods
- Pan W. 2003. On the use of permutation in and
the performance of a class of nonparametric methods to detect
differential gene expression. Bioinformatics. 2003 Jul
22;19(11):1333-40. [Pubmed]
- Huang X, Pan W. 2002.
Comparing three methods for variance estimation with duplicated high
density oligonucleotide arrays. Funct Integr Genomics. 2002
Aug;2(3):126-33. Epub 2002 Jul 24.
[Pubmed]
- Park PJ, Pagano M,
Bonetti M. 2001. A nonparametric scoring algorithm for identifying
informative genes from microarray data. Pac Symp Biocomput:52-63[Pubmed]
- Troyanskaya OG, Garber
ME, Brown PO, Botstein D, Altman RB. 2002. Nonparametric methods for
identifying differentially expressed genes in microarray
data. Bioinformatics 2002 Nov;18(11):1454-61[Pubmed]
- Li C, Wong WH. 2001
Model-based analysis of oligonucleotide arrays: expression index
computation and outlier detection. Proc Natl Acad Sci U S A. 98(1):31-6.[Pubmed]
4. False Discovery
Rate Estimation
- Efron B, Tibshirani
R. Empirical bayes methods and false discovery rates for microarrays.
Genet Epidemiol 2002 Jun;23(1):70-86[Pubmed]
- Storey, J. (2002) A
direct approach to false discovery rates. J. Roy. Stat. Soc. Ser. B,
64:479-498.[Pubmed]
- Reiner A, Yekutieli
D, Benjamini Y. 2003. Identifying differentially expressed genes using
false discovery rate controlling procedures. Bioinformatics
Feb;19(3):368-75.[Pubmed]
- Tusher VG,
Tibshirani R, Chu G. 2001. Significance analysis of microarrays applied
to the ionizing radiation response. Proc Natl Acad Sci USA 98,5116-21.[Pubmed]
- Westfall, PH and
Young, SS 1989. P-value adjustments for multiple tests in multivariate
binomial models, Journal of the American Statistical Association, 84,
780-786. [Pubmed - no entry]
5. Multivariate Analysis
- Peterson LE. 2003. Partitioning large-sample microarray-based gene
expression profiles using principal components analysis. Comput Methods
Programs Biomed. 2003 Feb;70(2):107-19. [Pubmed]
Model-Based Analysis
6. Likelihood Models
- Ideker, T., Thorsson,
V., Siegel, A.F., and Hood, L.E. 2000. Testing for
differentially-expressed genes by maximum-likelihood analysis of
microarray data. Journal of Computational Biology 7: 805-817. [Pubmed]
7. Bayesian Models
- Baldi P, Long AD.
2001. A Bayesian framework for the analysis of microarray expression
data, regularized t-test and statistical inferences of gene changes.
Bioinformatics 17,509-19.[Pubmed]
- Broet P, Richardson S,
Radvanyi F. Bayesian hierarchical model for identifying changes in gene
expression from microarray experiments. J Comput Biol. 2002;9(4):671-83.
[Pubmed]
- Domingos, P. and Pazzani,
M. (1997). On the optimality of the simple Bayesian classifier under
zero-one loss. Machine Learning, 29, 103--130.[Citeseer]
- Friedman N, Linial M,
Nachman I, Pe'er D. Using Bayesian networks to analyze expression data J
Comput Biol. 2000;7(3-4):601-20.[Pubmed]
- Ibrahim, J.G., Chen,
M.H., and Gray, R.J. Bayesian models for gene expression with DNA
microarray data. Journal of the American Statistical Association 97:
88-99, 2002.[Pubmed]
- Kendziorski, C.M.,
M.A. Newton, H. Lan, and M.N. Gould. 2003. On parametric empirical Bayes
methods for comparing multiple groups using replicated gene expression
profiles. Statistics in Medicine, to appear.[Pubmed]
- Lee KE, Sha N,
Dougherty ER, Vannucci M, Mallick BK. Gene selection: a Bayesian variable
selection approach. Bioinformatics. 2003 Jan;19(1):90-7.[Pubmed]
- Townsend JP, Hartl DL.
Bayesian analysis of gene expression levels: statistical quantification
of relative mRNA level across multiple strains or treatments. Genome Biol
2002;3(12):RESEARCH0071[Pubmed]
- Theilhaber J, Bushnell
S, Jackson A, Fuchs R. 2001. Bayesian estimation of fold-changes in the
analysis of gene expression: the PFOLD algorithm. J Comput Biol
8:585-614.[Pubmed]
- Li Y, Campbell C,
Tipping M. 2002. Bayesian automatic relevance determination algorithms
for classifying gene expression data. Bioinformatics 18:1332-9. [Pubmed]
8. Other Models
- Zhou X, Wang X, Dougherty ER. 2003. Binarization of microarray data on
the basis of a mixture model. Mol Cancer Ther. 2003
Jul;2(7):679-84. [Pubmed]
- Kato M, Tsunoda T, Takagi T. 2000. Inferring genetic networks from DNA
microarray data by multiple regression analysis. Genome Inform Ser
Workshop Genome Inform. 11:118-28 [Pubmed]
- Li, C WH Wong 2001. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc Natl Acad Sci U S A 98(1): 31-6.[Pubmed]
IV. Clustering (Supervised and Unsupervised, Gene
and Sample)/Classification
-
Alon U, Barkai N,
Notterman DA, Gish, K, Ybarra, S. Mack, D and Levine, AJ. 1999. Broad patterns
of gene expression revealed by clustering analysis of tumor and normal colon
tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA. 96:
6745-6750.[Pubmed]
-
Ball, G. and
Hall, D. A clustering technique for summarizing multivariate data. Behavioral
Science 12 (1967), 153-155[Pubmed]
-
Bagirov AM,
Ferguson B, Ivkovic S, Saunders G, Yearwood J. 2003. New algorithms for
multi-class cancer diagnosis using tumor gene expression signatures.
Bioinformatics. 2003 Sep 22;19(14):1800-7. [Pubmed]
-
Ben-Dor A, R
Shamir, and Z Yakhini.1999. Clustering gene expression patterns. Journal of
Computational Biology, 6(3/4):281-297.[Pubmed]
-
Cherepinsky V,
Feng J, Rejali M, Mishra B. Shrinkage-based similarity metric for cluster
analysis of microarray data. Proc Natl Acad Sci U S A. 2003 Aug
19;100(17):9668-73. Epub 2003 Aug 05. [Pubmed]
-
Dougherty ER,
Barrera J, Brun M, Kim S, Cesar RM, Chen Y, Bittner M, Trent JM. 2002.
Inference from clustering with application to gene-expression microarrays. J
Comput Biol. 9,105-26.[Pubmed]
-
Dudoit S and J.
Fridlyand (2002). A prediction-based resampling method to estimate the number
of clusters in a dataset. Genome Biology , Vol. 3, No. 7, p. 0036.1 --
0036.21.[Pubmed]
-
Dudoit S., J.
Fridlyand, and T. P. Speed (2002a). Comparison of discrimination methods for
the classification of tumors using gene expression data. Journal of the
American Statistical Association, Vol. 97, No. 457, p. 77--87.[Pubmed - no
entry]
-
Eisen, M.B.,
Spellman, P.T., Brown, P.O., and Botstein, D. 1998. Cluster analysis and
display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA.,
95,14863-14868.[Pubmed]
-
Gasch AP, Eisen
MB. 2002. Exploring the conditional coregulation of yeast gene expression
through fuzzy k-means clustering. Genome Biol. 3,RESEARCH0059.[Pubmed]
-
Hartuv E, Schmitt AO, Lange J,
Meier-Ewert S, Lehrach H, Shamir R. An algorithm for clustering cDNA
fingerprints. Genomics 2000 Jun 15;66(3):249-56[Pubmed]
-
Jain AK, Dubes RC. 1988. Algorithms for
Clustering Data. Englewood Cliffs, NJ:Prentice-Hall.
-
Kluger Y, Basri R, Chang JT, Gerstein M.
Spectral biclustering of microarray data: coclustering genes and conditions.
Genome Res. 2003 Apr;13(4):703-16.[Pubmed]
-
Lee
Y, Lee CK. Classification of multiple cancer types by multicategory
support vector machines using gene expression data. Bioinformatics.
2003 Jun 12;19(9):1132-9. [Pubmed]
-
Li
H, Hong F. 2001. Cluster-Rasch models for microarray gene expression
data. Genome Biol. 2001;2(8):RESEARCH0031. Epub 2001 Jul 31 [Pubmed]
- McConnell
P, Johnson K, Lin S. 2002. Applications of Tree-Maps to hierarchical
biological data. Bioinformatics. 2002 Sep;18(9):1278-9.[Pubmed]
- McLachlan GJ, Bean RW, Peel D. 2002. A mixture
model-based approach to the clustering of microarray expression data.
Bioinformatics. 2002 Mar;18(3):413-22. [Pubmed]
- McShane
LM, Radmacher MD, Freidlin B, Yu R, Li MC, Simon R. 2002. Methods for
assessing reproducibility of clustering patterns observed in analyses of
microarray data. Bioinformatics. 2002 Nov;18(11):1462-9. [Pubmed]
- Medvedovic M, Sivaganesan S. Bayesian
infinite mixture model based clustering of gene expression profiles.
Bioinformatics. 2002 Sep;18(9):1194-206.[Pubmed]
- Milligan, G. W. and M. C. Cooper (1986).
A study of the comparability of external criteria for hierarchical cluster
analysis. Multivariate Behavioral Research 21, 441--458.[Pubmed no entry]
- Nguyen DV, Rocke DM. Tumor classification by partial least squares using microarray
gene expression data. Bioinformatics. 2002 Jan;18(1):39-50. [Pubmed]
- Radmacher
MD, McShane LM, Simon R. 2002. A paradigm for class prediction using gene
expression profiles. J Comput Biol. 2002;9(3):505-11.[Pubmed]
- Romualdi
C, Campanaro S, Campagna D, Celegato B, Cannata N, Toppo S, Valle G,
Lanfranchi G. 2003. Pattern recognition in gene expression profiling using
DNA array: a comparative study of different statistical methods applied to
cancer classification. Hum Mol Genet. 2003 Apr 15;12(8):823-36. [Pubmed]
- Sawa T, Ohno-Machado L. A neural
network-based similarity index for clustering DNA microarray data. Comput Biol
Med. 2003 Jan;33(1):1-15[Pubmed]
- Sharan R., and Shamir R. 2000. CLICK: A
clustering algorithm with applications to gene expression analysis. In
Proceedings of the Eighth International Conference on Intelligent Systems for
Molecular Biology (ISMB), pages 307-316.[Pubmed - no entry]
- Simon
R, Radmacher MD, Dobbin K, McShane LM. 2003. Pitfalls in the use of DNA
microarray data for diagnostic and prognostic classification. J Natl Cancer Inst. 2003 Jan 1;95(1):14-8. [Pubmed]
- Sultan M, Wigle DA, Cumbaa CA, Maziarz M,
Glasgow J, Tsao MS, Jurisica I. 2002. Binary tree-structured vector
quantization approach to clustering and visualizing microarray data.
Bioinformatics Suppl 1,S111-S119. [Pubmed]
- Tibshirani, Hastie, Narasimhan and Chu
(2002) Diagnosis of multiple cancer types by shrunken centroids
of gene expression PNAS 99:6567-6572. [Pubmed]
- Tsao, ECK, JC Bezdek and NR Pal. 1994.
Fuzzy Kohonen clustering networks. Pattern Recognition 27,757-764.[Pubmed no
entry]
- Valentini G. Gene expression data
analysis of human lymphoma using support vector machines and output coding
ensembles. Artif Intell Med. 2002 Nov;26(3):281-304. [Pubmed]
- Wuju L, Momiao X. 2002. Tclass: tumor classification system based on gene expression profile.
Bioinformatics. 2002 Feb;18(2):325-6. [Pubmed]
- Xing, E P. and R M. Karp. 2001. CLIFF:
Clustering of high-dimensional microarray data via iterative feature filtering
using normalized cuts. In Proceedings of the GCB.[Pubmed]
- Yeung KY, WL Ruzzo 2001a. Principal
component analysis for clustering gene expression data. Bioinformatics 17:
763-774.[Pubmed]
- Yeung, K. Y., Haynor, D. R. and Ruzzo, W.
L. (2000) Validating clustering for gene expression data. Bioinformatics. 2001
Apr;17(4):309-18. [Pubmed]
- Yeung,
K. Y., C. Fraley, A. Murua, A. E. Raftery, and W. L. Ruzzo. 2001.
Model-based clustering and data transformations for gene expression data.
Bioinformatics 17:977--987, 2001.[Pubmed]
- Zhang
H, Yu CY, Singer B, Xiong M. Recursive partitioning for tumor
classification with gene expression microarray data. Proc Natl Acad Sci U
S A. 2001 Jun 5;98(12):6730-5. Epub 2001 May 29.[Pubmed]
- Zhang K, Zhao H. Assessing reliability of gene
clusters from gene expression data. Funct Integr Genomics. 2000
Nov;1(3):156-73. [Pubmed]
Multivariate Analysis
1. Alter, O., P.O.
Brown and D. Botstein. 2000. Singular value decomposition for genome-wide
expression data processing and modeling. Proc. Natl. Acad. Sci. USA,
97,10101-10106.[Pubmed]
2. Bicciato S,
Luchini A, Di Bello C. PCA disjoint models for multiclass cancer analysis using
gene expression data. Bioinformatics. 2003 Mar 22;19(5):571-8.[Pubmed]
3. Culhane AC,
Perriere G, Considine EC, Cotter TG, Higgins DG. Between-group analysis of microarray
data. Bioinformatics. 2002 Dec;18(12):1600-8.[Pubmed]
4. Fellenberg, K, NC
Hauser, B Brors, A Neutzner, JD Hoheisel and M Vingron. 2001. Correspondence
analysis applied to microarray data. Proc. Natl. Acad. Sci. USA.,
98,10781-10786.[Pubmed]
5. Ghosh D. 2002.
Resampling methods for variance estimation of singular value decomposition
analyses from microarray experiments. Funct Integr Genomics Aug;2(3):92-7[Pubmed]
6. Ghosh D. 2002.
Singular value decomposition regression models for classification of tumors
from microarray experiments. Pac Symp Biocomput,18-29.[Pubmed]
7. Kerr MK, Martin
M, Churchill GA. 2000. Analysis of variance for gene expression microarray
data. J Comput Biol 7, 819-37.[Pubmed]
8. Wall ME, Dyck PA,
Brettin TS. 2001 SVDMAN--singular value decomposition analysis of microarray
data. Bioinformatics 2001 Jun;17(6):566-8.[Pubmed]
V. Machine Learning
-
Brown, M, W.
Grundy, D. Lin, N. Cristianini, C. Sugnet, T. Furey, M. Ares Jr, and D. Haussler.
2000. Knowledge-based analysis of microarray gene expression data by using
support vector machines. Proc. Natl. Acad. Sci. 97:262-267.[Pubmed]
-
Furey, T.S., N.
Cristianini, N. Duffy, D.W. Bednarski, M. Schummer and D. Haussler. 2000.
Support vector machine classification and validation of cancer tissue samples
using microarray expression data. Bioinformatics, 16, 906-914.[Pubmed]
-
Kohavi R and GH.
John. 1997. Wrappers for feature subset selection. Artificial Intelligence,
97(1-2):273-324.[Pubmed no entry]
-
Koller D and M
Sahami. 1996. Toward optimal feature selection. In International Conference on
Machine Learning, pages 284-292.[Pubmed no entry]
-
Lee Y, Lee CK.
Classification of multiple cancer types by multicategory support vector
machines using gene expression data. Bioinformatics. 2003 Jun
12;19(9):1132-9.[Pubmed]
-
Lyons-Weiler J,
Patel S, Bhattacharya S. (2003) A classification-based machine learning
approach for the analysis of genome-wide expression data. Genome Res
Mar;13(3):503-12[Pubmed]
-
Xing E P., M I.
Jordan, and R M. Karp 2001a. Feature selection for high-dimensional genomic
microarray data. In Proc. 18th International Conf. on Machine Learning, pages
601-608, Morgan Kaufmann, San Francisco, CA.
-
Xiong M, Li W,
Zhao J, Jin L, Boerwinkle E. 2001. Feature (gene) selection in gene
expression-based tumor classification. Mol Genet Metab 73,239-47.[Pubmed]
-
Xiong M, X Fang,
J Zhao. 2001. Biomarker identification by feature wrappers. Genome Research,
11:1878-1887.[Pubmed]
-
Quinlan, R. 1994.
C4.5: programs for machine learning. Morgan Kaufmann.[Link]
-
Quinlan, R. 1996. Improved use of
continuous attributes in C4.5. JAIR 4,77-90[JAIR]
-
Rougemont J,
Hingamp P. DNA microarray data and contextual analysis of correlation
graphs. BMC Bioinformatics. 2003 Apr 29;4(1):15.[Pubmed]
Regression Trees
1. Breiman L,
Friedman JH, Olshen RA, Stone CJ. Classification and regression trees. Belmont
(CA): Wadsworth International Group; 1984.[Amazon.com]
Bagging
1. Breiman L. 1996.
Bagging Predictors. Machine Learning 24(2): 123-140.[Pubmed no entry]
2. Dudoit S,
Fridlyand J. 2003. Bagging to improve the accuracy of a clustering procedure.
Bioinformatics. 12;19(9):1090-9. [Pubmed]
3. Hastie, T., R.
Tibshirani, J.H.F. Friedman. The Elements of Statistical Learning. (Springer) [Amazon.com]
Voting Methods/Boosting
-
Bijlani R, Cheng
Y, Pearce DA, Brooks AI, Ogihara M. 2003. Prediction of biologically
significant components from microarray data: Independently Consistent
Expression Discriminator (ICED). Bioinformatics. 2003 Jan;19(1):62-70.[Pubmed]
-
Dettling M, Buhlmann P. Boosting for tumor
classification with gene expression data. Bioinformatics. 2003 Jun
12;19(9):1061-9. [Pubmed]
-
Dudoit S,
Fridlyand J. Bagging to improve the accuracy of a clustering procedure.
Bioinformatics. 2003 Jun 12;19(9):1090-9. [Pubmed]
-
Schapire R.E., Y.
Freund, P. Bartlett, W.S. Lee. 1998. Boosting the Margin: A new explanation for
the effectiveness of voting methods. The Annals of Statistics, vol.26, pp.
1651-1686.[Pubmed no entry]
Jackknife to Reduce False Positives
1. Lyons-Weiler J,
Patel S, Bhattacharya S. (2003) A classification-based machine learning
approach for the analysis of genome-wide expression data. Genome Res
Mar;13(3):503-12[Pubmed]
VI. Neural Networks/AI
-
Ando T, Suguro M,
Hanai T, Kobayashi T, Honda H, Seto M. 2002. Fuzzy neural network applied to
gene expression profiling for predicting the prognosis of diffuse large B-cell
lymphoma. Jpn J Cancer Res. 93,1207-12.[Pubmed]
-
Azuaje F. 2001. A
computational neural approach to support the discovery of gene function and
classes of cancer. IEEE Trans Biomed Eng. 48,332-9.[Pubmed]
-
Bicciato
S, Pandin M, Didone G, Di Bello C. Pattern identification and classification
in gene expression data using an autoassociative neural network model.
Biotechnol Bioeng. 2003 Mar 5;81(5):594-606.[Pubmed]
-
Bishop, C. M.
1995. Neural Networks for Pattern Recognition. Oxford University
Press.[Amazon.com]
-
Bishop, C. M.
1999. Bayesian PCA. In M. S. Kearns, S. A. Solla, and D. A. Cohn (Eds.),
Advances in Neural Information Processing Systems, Volume 11, pp. 382-388. MIT
Press [Amazon.com]
-
Deutsch JM.
Evolutionary algorithms for finding optimal gene sets in microarray
prediction. Bioinformatics. 2003 Jan;19(1):45-52.[Pubmed]
-
Dettling M,
Buhlmann P. Supervised clustering of genes. Genome Biol.
2002;3(12):RESEARCH0069. Epub 2002 Nov 25.[Pubmed]
-
Gasch AP, Eisen
MB. 2002. Exploring the conditional coregulation of yeast gene expression
through fuzzy k-means clustering. Genome Biol. 3,RESEARCH0059.[Pubmed]
-
Huntsberger T.L.
and Ajjimarangsee P., 1992. Parallel self-organising feature maps for
unsupervised pattern recognition. In, Bezdek J.C. and Pal N.R, Editors, Fuzzy
models for pattern recognition, pp 483-495. IEEE Press, New York.[Amazon.com]
-
Jordan M. 1995.
Why the logistic function? A tutorial discussion on probabilities and neural
networks. TR 9503, Computational Cognitive Science, MIT.[CiteSeer]
-
Jornsten R, Yu B.
2003. Simultaneous gene clustering and subset selection for sample
classification via MDL. Bioinformatics. 2003 Jun 12;19(9):1100-9.[Pubmed]
-
Khan J, Wei JS,
Ringnér M, Saal LH, Ladanyi M, Westermann F, et al. Classification and
diagnostic prediction of cancers using gene expression profiling and artificial
neural networks. Nat Med 2001;7:673-9.[Pubmed]
-
Kohonen T,
Somervuo P. How to make large self-organizing maps for nonvectorial data.
Neural Netw 2002 Oct-Nov;15(8-9):945-52[Pubmed]
-
Kohonen, T. 1982. Self-organized
formation of topologically correct feature maps. Biol. Cybern. 43,59-69.[Pubmed]
-
Neal, R. M. 1996. Bayesian Learning for
Neural Networks, volume 118 of Lecture Notes in Statistics. Springer.[Amazon.com]
-
Mateos A, Dopazo J, Jansen R, Tu Y,
Gerstein M, Stolovitzky G. 2002 Systematic learning of gene functional
classes from DNA array expression data by using multilayer perceptrons. Genome
Res. Nov;12(11):1703-15. [Pubmed]
-
O'Neill MC, Song L. Neural network analysis
of lymphoma microarray data: prognosis and diagnosis near-perfect. BMC
Bioinformatics. 2003 Apr 10;4(1):13. [Pubmed]
-
Ringner M, Peterson C. Microarray-based
cancer diagnosis with artificial neural networks. Biotechniques. 2003
Mar;Suppl:30-5.[Pubmed]
-
Ripley BD. Pattern recognition and neural
networks. Cambridge (U.K.): Cambridge University Press; 1996.[Amazon.com]
-
Sawa T, Ohno-Machado L. A neural
network-based similarity index for clustering DNA microarray data. Comput Biol
Med. 2003 Jan;33(1):1-15[Pubmed]
-
Selaru FM, Xu Y, Yin J, Zou T, Liu TC,
Mori Y, Abraham JM, Sato F, Wang S, Twigg C, Olaru A, Shustova V, Leytin A,
Hytiroglou P, Shibata D, Harpaz N, Meltzer SJ. 2002. Artificial neural networks
distinguish among subtypes of neoplastic colorectal lesions. Gastroenterology
122(3):606-13.[Pubmed]
-
Tipping, M.E. and C.M. Bishop 1999.
Mixtures of probabilistic principal component analyzers. Neural Computation,
11(2):443-482, 1999.[Pubmed]
-
Tomida S, Hanai T, Honda H, Kobayashi T.
2002. Analysis of expression profile using fuzzy adaptive resonance theory.
Bioinformatics 18,1073-83.[Pubmed]
-
Tsao, ECK, JC Bezdek and NR Pal. 1994.
Fuzzy Kohonen clustering networks. Pattern Recognition 27,757-764.[Pubmed no
entry]
VII. Computational Validation
1. Dudoit S., J.
Fridlyand, and T. P. Speed (2002a). Comparison of discrimination methods for
the classification of tumors using gene expression data. Journal of the
American Statistical Association, Vol. 97, No. 457, p. 77-87.[Pubmed]
2. Efron B,
Tibshirani R. Improvements on cross-validation: the .632+ bootstrap method. J
Am Stat Assoc 1997;92:548-60[Pubmed]
3. Landgrebe J,
Wurst W, Welzl G. Permutation-validated principal components analysis of
microarray data. Genome Biol. 2002;3(4):RESEARCH0019. Epub 2002 Mar 22. [Pubmed]
4. Felsenstein, J.
1985. Confidence limits on phylogenies: An approach using the bootstrap.
Evolution 39: 666-670.[Pubmed]
VIII. Evaluation/Comparisons
1. Dudoit S., J.
Fridlyand, and T. P. Speed (2002a). Comparison of discrimination methods for
the classification of tumors using gene expression data. Journal of the
American Statistical Association, Vol. 97, No. 457, p. 77-87.[Pubmed]
2. William J. Lemon,
Jeffrey J.T. Palatini, Ralf Krahe and Fred A. Wright (2002) Theoretical and
experimental comparisons of gene expression indexes for oligonucleotide
arrays. Bioinformatics. Vol 18, 1470-1476. [Pubmed]
3. Kooperberg C,
Sipione S, LeBlanc M, Strand AD, Cattaneo E, Olson JM. 2002. Evaluating test
statistics to select interesting genes in microarray experiments. Hum Mol
Genet 11,2223-32.[Pubmed]
4. Pan, W. (2002) A
Comparative Review of Statistical Methods for Discovering Differentially
Expressed Genes in Replicated Microarray Experiments. Bioinformatics,
18, 546-554 [Pubmed]
5. Powell DA,
Anderson LM, Cheng RY, Alvord WG 2002. Robustness of the Chen-Dougherty-Bittner
procedure against non-normality and heterogeneity in the coefficient of
variation. J Biomed Opt. 2002 Oct;7(4):650-60. [Pubmed]
IX. Regulatory Networks
- Kato
M, Tsunoda T, Takagi T. 2000. Inferring genetic networks from DNA
microarray data by multiple regression analysis. Genome Inform Ser
Workshop Genome Inform. 11:118-28 [Pubmed]
- Peterson
LE. Partitioning large-sample microarray-based gene expression profiles
using principal components analysis. Comput Methods Programs Biomed. 2003
Feb;70(2):107-19. [Pubmed]
- Bagirov
AM, Ferguson B, Ivkovic S, Saunders G, Yearwood J. New algorithms for
multi-class cancer diagnosis using tumor gene expression
signatures. Bioinformatics. 2003 Sep 22;19(14):1800-7. [Pubmed]
- Yoo
C, Cooper GF. Discovery of gene-regulation pathways using local causal
search. Proc AMIA Symp. 2002:914-8.[Pubmed]
X. Functional Interpretation
- Draghici S, Khatri P,
Martins RP, Ostermeier GC, Krawetz SA. Global functional profiling of gene
expression. Genomics. 2003 Feb;81(2):98-104. [Pubmed]
- Masys DR, Welsh JB, Fink JL,
Gribskov M, Klacansky I, Corbeil J. Use of keyword hierarchies to
interpret gene expression patterns. Bioinformatics. 2001 Apr;
17(4):319-26.[Pubmed]
- Fink JL, Drewes S, Patel H,
Welsh JB, Masys DR, Corbeil J, Gribskov M. 2HAPI: a microarray data
analysis system. Bioinformatics. 2003 Jul 22;19(11):1443-5. [Pubmed]
XI. Integrating diverse genomic and
proteomic data sources
- The Gene Ontology
Consortium. 2000. Gene Ontology: tool for the unification of biology.
Nature Genetics 25: 25-29.[Pubmed]
- Fickett JW, Wasserman WW
2000. Discovery and modeling of transcriptional regulatory regions. Curr
Opin Biotechnol. 2000 Feb;11(1):19-24.[Pubmed]
- Suzuki S, Moore DH 2nd,
Ginzinger DG, Godfrey TE, Barclay J, Powell B, Pinkel D, Zaloudek C, Lu K,
Mills G, Berchuck A, Gray JW. 2000. An approach to analysis of large-scale
correlations between genome changes and clinical endpoints in ovarian
cancer. Cancer Res. 2000 Oct 1;60(19):5382-5.[Pubmed]
- Walhout AJ, Reboul J,
Shtanko O, Bertin N, Vaglio P, Ge H, Lee H, Doucette-Stamm L, Gunsalus KC,
Schetter AJ, Morton DG, Kemphues KJ, Reinke V, Kim SK, Piano F, Vidal M.
2002. Integrating interactome, phenome, and transcriptome mapping data for
the C. elegans germline. Curr Biol. 2002 Nov 19;12(22):1952-8.[Pubmed]
- Zhou X, Kao MC, Wong WH.
2002. Transitive functional annotation by shortest-path analysis of gene
expression data. Proc Natl Acad Sci U S A. 2002 Oct 1;99(20):12783-8. Epub
2002 Aug 26.[Pubmed]
XII. Other Software
At the NCI
- BRB-ArrayTools (http://linus.nci.nih.gov/BRB-ArrayTools.html)
[Pubmed]
- GoMiner
- CIM Maker
(Color-Coded Image Map)
- Gene Expression Data
Analysis Workbench
- Gene Expression Data Portal (http://gedp.nci.nih.gov/dc/index.jsp)
Elsewhere
- Bioconductor (http://www.bioconductor.org/)
Dudoit S, Gentleman RC, Quackenbush J. 2003. Open source software for the
analysis of microarray data. Biotechniques. 2003
Mar;Suppl:45-51. [Pubmed]
- OntoTools (http://vortex.cs.wayne.edu/projects.htm) [Pubmed]
- OncoMine (http://www.oncomine.org/) [Pubmed]
- TIGR MeV (http://www.tigr.org/software/tm4/index.html)
[Pubmed]
- See a list of other software
tools at the Stanford
Microarray Database page.
last updated 11/06/03 by JLW