>> Noel O'Boyle

The bootstrap estimate of misclassification rates

There are two R packages dealing with the bootstrap statistic, 'boot' and 'bootstrap'. The former is related to "Bootstrap Methods and their Application" by Davison and Hinkley (1997), whereas the latter is based upon "An Introduction to the Bootstrap" by Efron and Tibshirani (1993). The book by Efron and Tibshirani is very good - read it if you can (I haven't read the other book). This page relates to the algorithms described in chapter 17.

Neither package has a good example showing how to estimate misclassification rates (indeed, I am not sure whether 'boot' has a suitable function at all). In any case, here's how you do it for the built-in iris data, using decision trees (from the 'tree' package) to classify the samples.

library(bootstrap)
library(tree)
# Fit a classification tree; bootpred passes the predictors as a matrix,
# so convert back to a data frame for tree()
theta.fit <- function(x,y) {tree(y ~ ., as.data.frame(x))}
# Return predicted class labels for new predictor values
theta.predict <- function(fit,x) {predict(fit, as.data.frame(x), type="class")}
# Per-sample misclassification indicator: 1 if wrong, 0 if right
miss.class <- function(y,yhat) { 1*(yhat!=y) }
# 100 bootstrap replications
ans <- bootpred(iris[,1:4],iris[,5],100,theta.fit,theta.predict,miss.class)
ans

With the result:

[[1]]
[1] 0.02666667
 
[[2]]
[1] 0.01773333
 
[[3]]
[1] 0.04642104

which are, respectively, the apparent error rate (4/150), the optimism (2.66/150), and the ".632" estimate of prediction error (about 7/150). The ordinary bootstrap estimate of prediction error is 6.66/150, found by adding the apparent error rate and the optimism. According to Efron and Tibshirani (1993), cross-validation (CV) is roughly unbiased but can show large variability; the simple bootstrap (not shown above) has lower variability but can be severely biased downwards; the more refined bootstrap estimate (6.66/150) is an improvement but still suffers from downward bias; and "in the few studies to date" the .632 estimator performed the best. It is interesting to note that in this case the CV misclassification rate equals the apparent error rate (see the related page on crossvalidation of supervised classification algorithms).
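The three numbers returned by bootpred are related by the standard formulas from chapter 17 of Efron and Tibshirani (1993). A short sketch using the values above (the leave-one-out bootstrap error eps0 is not returned directly, but can be recovered from the .632 estimate):

```r
# Relationships among the three numbers returned by bootpred
app.err <- 0.02666667   # apparent (resubstitution) error, 4/150
optim   <- 0.01773333   # bootstrap estimate of the optimism
err.632 <- 0.04642104   # ".632" estimate of prediction error

# The ordinary bootstrap estimate adds the optimism to the apparent error:
boot.est <- app.err + optim     # 0.0444, i.e. 6.66/150

# The .632 estimate is a weighted combination of the apparent error and the
# leave-one-out bootstrap error eps0: err.632 = 0.368*app.err + 0.632*eps0.
# Solving for eps0 from the values above:
eps0 <- (err.632 - 0.368 * app.err) / 0.632     # about 0.058
```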

In 1997, Efron and Tibshirani published a paper on an improved estimator of misclassification rates, called ".632+" (J. Am. Statist. Assoc., 1997, 92, 548). This estimator combines low variance with only moderate bias. It is available using the errorest function in the 'ipred' package as follows:

library(ipred)
library(tree)   # errorest() below fits the model with tree()
# predict() for tree objects needs type="class" to return class labels
mypredict.tree <- function(fit,newdata) {predict(fit, newdata, type="class")}
ans <- errorest(Species ~ ., iris, model=tree, estimator="632plus", predict=mypredict.tree,
       est.para=control.errorest(nboot=100))
ans
#         .632+ Bootstrap estimator of misclassification error
#                  with 100 bootstrap replications
#                   
#                   Misclassification error:  0.0475
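For reference, the .632+ estimator reweights the .632 combination using the "no-information" error rate gamma (the error expected if predictors and labels were independent). A sketch of the formula from the 1997 paper, with illustrative numbers rather than the exact internal values used by errorest:

```r
# Sketch of the .632+ formula (Efron & Tibshirani, 1997); values are illustrative
err.bar <- 0.0267      # apparent error
eps0    <- 0.0579      # leave-one-out bootstrap error
gamma   <- 2/3         # no-information rate for 3 balanced classes: 1 - 3*(1/3)^2

R <- (eps0 - err.bar) / (gamma - err.bar)   # relative overfitting rate, in [0, 1]
w <- 0.632 / (1 - 0.368 * R)                # adjusted weight (w = 0.632 when R = 0)
err.632plus <- (1 - w) * err.bar + w * eps0
err.632plus   # about 0.047
```

When there is no overfitting (R = 0) this reduces to the plain .632 estimate; as overfitting grows, more weight shifts onto the leave-one-out bootstrap error.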

The same function can also calculate an ordinary bootstrap estimate. This does not appear to give the same value as that found using 'bootpred', but it does have the advantage of reporting a standard deviation for the misclassification error:

ans <- errorest(Species ~ ., iris, model=tree, estimator="boot", predict=mypredict.tree,
       est.para=control.errorest(nboot=100))
ans
#               Bootstrap estimator of misclassification error
#                        with 100 bootstrap replications
#                         
#                         Misclassification error:  0.0603
#                         Standard deviation: 0.0028

Other models

The examples above deal with decision trees. What about another model, such as linear discriminant analysis (LDA)?

library(ipred)
library(MASS)   # provides lda()
# predict() for lda objects returns a list; the class labels are in $class
mypredict.lda <- function(fit,newdata) {predict(fit, newdata)$class}
ans <- errorest(Species ~ ., iris, model=lda, estimator="632plus", predict=mypredict.lda,
       est.para=control.errorest(nboot=100))

Here, the misclassification error is predicted to be 0.0235 (3.5/150), which is not much more than the apparent error rate of 3/150 (leave-one-out CV also gives 3/150).
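The apparent error rate for LDA quoted above can be checked directly by refitting on the full data and counting resubstitution errors (a quick sketch, assuming MASS is installed):

```r
# Apparent (resubstitution) error rate of LDA on iris,
# for comparison with the .632+ estimate above
library(MASS)
fit <- lda(Species ~ ., iris)
mean(predict(fit, iris)$class != iris$Species)   # 3 misclassified out of 150 = 0.02
```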