Supervised learning (no date) [incomplete]

Here are some documented examples of how to use supervised learning methods for the analysis of microarray data.

Supervised learning

The Bioconductor project provides a sample data package, golubEsets. The original data are available on the web at

http://www-genome.wi.mit.edu/mpr/data_set_ALL_AML.html

and is based on the publication

Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh M, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES. Science 1999; 286: 531-537.

To get this data, use the following commands:

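library("marray")      # classes and methods for two-color microarray data
library("golubEsets")  # Golub et al. (1999) data; also attaches Biobase, which provides pData() and exprs()
# Recent golubEsets releases name these data sets Golub_Train, Golub_Test and Golub_Merge.
data(golubTrain)       # training set (38 samples)
data(golubTest)        # independent test set (34 samples)
data(golubMerge)       # training and test sets combined (72 samples)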

The phenotypic data for the training data set look like this:

> pData(golubTrain)

  
Samples ALL.AML BM.PB T.B.cell ... Source
1        1     ALL    BM   B-cell ...   DFCI
2        2     ALL    BM   T-cell ...   DFCI
3        3     ALL    BM   T-cell ...   DFCI
.
.
.
32      32     AML    BM     <NA> ...  CALGB
33      33     AML    BM     <NA> ...  CALGB

The number of samples in each class, for the training, test, and merged data sets:

> table(pData(golubTrain)$ALL.AML)

ALL AML
 27  11

> table(pData(golubTest)$ALL.AML)

ALL AML
 20  14

> table(pData(golubMerge)$ALL.AML)

ALL AML
 47  25

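The expression values for the training set can be extracted as a matrix, with genes in rows and samples in columns. Note that R is case-sensitive, so the object name must match the one loaded above:

x <- exprs(golubTrain)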
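
With an expression matrix and the class labels in hand, a common first step before class prediction is to rank the genes by how well they separate the two classes. The following is only a minimal sketch of one way to do that, using a per-gene Welch t-statistic. The helper rank_genes() is hypothetical (it is not part of Bioconductor), and the matrix is simulated so that the block runs without the data package.

```r
# Hypothetical helper (not part of Bioconductor): order genes by the absolute
# two-sample Welch t-statistic between the two classes.
# x: numeric matrix, genes in rows and samples in columns
# y: two-level class label, one entry per column of x
rank_genes <- function(x, y) {
  y <- factor(y)
  stopifnot(nlevels(y) == 2, ncol(x) == length(y))
  a <- x[, y == levels(y)[1], drop = FALSE]
  b <- x[, y == levels(y)[2], drop = FALSE]
  # Standard error of the difference in means, allowing unequal variances.
  se <- sqrt(apply(a, 1, var) / ncol(a) + apply(b, 1, var) / ncol(b))
  tstat <- (rowMeans(a) - rowMeans(b)) / se
  # Row indices, most discriminating gene first.
  order(abs(tstat), decreasing = TRUE)
}

# Simulated stand-in for the training data (an assumption for illustration):
# 100 genes by 38 samples, with the first 5 genes shifted upward in AML.
set.seed(1)
y_sim <- factor(rep(c("ALL", "AML"), times = c(27, 11)))
x_sim <- matrix(rnorm(100 * 38), nrow = 100)
x_sim[1:5, y_sim == "AML"] <- x_sim[1:5, y_sim == "AML"] + 5
top <- rank_genes(x_sim, y_sim)[1:5]
print(sort(top))

# With the real data, assuming the objects loaded earlier:
# top <- rank_genes(exprs(golubTrain), pData(golubTrain)$ALL.AML)[1:50]
```

The cutoff of 50 genes in the commented call is arbitrary. Note also that the t-statistic is one choice among several, and is not necessarily the ranking used in the publication cited above.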

Supervised learning (class prediction)

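Methods for supervised learning include regression analysis, discriminant analysis, classification and regression trees (CART), neural networks, and support vector machines.

For an introduction to support vector machines, see http://www.acm.org/sigs/sigkdd/explorations/issue2-2/bennett.pdf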
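
To make the idea of class prediction concrete, here is a minimal sketch of one simple form of discriminant analysis, diagonal linear discriminant analysis (DLDA): each new sample is assigned to the class whose mean profile is closest, with every gene scaled by its pooled within-class variance. The functions dlda_fit() and dlda_predict() are hypothetical helpers written for this illustration. Equal prior probabilities are assumed, and the data are simulated so that the block runs without the data package.

```r
# Hypothetical helpers for diagonal linear discriminant analysis (DLDA).
# x: numeric matrix, genes in rows and samples in columns; y: class labels.
dlda_fit <- function(x, y) {
  y <- factor(y)
  stopifnot(ncol(x) == length(y), nrow(x) > 1)
  # Per-class mean of each gene: one column per class.
  means <- sapply(levels(y), function(k) rowMeans(x[, y == k, drop = FALSE]))
  # Pooled within-class variance of each gene.
  resid <- x - means[, as.integer(y), drop = FALSE]
  pooled_var <- rowSums(resid^2) / (ncol(x) - nlevels(y))
  list(means = means, pooled_var = pooled_var, classes = levels(y))
}

dlda_predict <- function(fit, newx) {
  # Variance-scaled squared distance from every sample to every class mean.
  d <- sapply(seq_along(fit$classes), function(k) {
    colSums((newx - fit$means[, k])^2 / fit$pooled_var)
  })
  d <- matrix(d, nrow = ncol(newx))  # keep a matrix even for a single sample
  # Assign each sample to the class with the smallest distance.
  nearest <- max.col(-d, ties.method = "first")
  factor(fit$classes[nearest], levels = fit$classes)
}

# Simulated stand-in data (an assumption for illustration): 50 genes,
# with the first 10 genes shifted upward in the AML samples.
simulate <- function(n_all, n_aml, n_genes = 50, shift = 3) {
  y <- factor(rep(c("ALL", "AML"), times = c(n_all, n_aml)))
  x <- matrix(rnorm(n_genes * length(y)), nrow = n_genes)
  x[1:10, y == "AML"] <- x[1:10, y == "AML"] + shift
  list(x = x, y = y)
}
set.seed(2)
train <- simulate(27, 11)  # same class sizes as the training set
test <- simulate(20, 14)   # same class sizes as the test set

fit <- dlda_fit(train$x, train$y)
pred <- dlda_predict(fit, test$x)
y_test <- test$y
print(table(predicted = pred, actual = y_test))

# With the real data, assuming the objects loaded earlier:
# fit  <- dlda_fit(exprs(golubTrain), pData(golubTrain)$ALL.AML)
# pred <- dlda_predict(fit, exprs(golubTest))
```

Fitting on the training set and evaluating on the separate test set, as in the commented lines, keeps the error estimate honest. In practice a gene selection step would usually come before the fit.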