What is Resampling?

Resampling is a family of methods that researchers use to determine whether a model is accurate enough and to uncover problems with it. A common practice in machine learning is to set aside part of the data as a validation set, a method known as cross-validation.

1-Randomization exact test

A randomization exact test (also called a permutation test) is a test procedure in which the data are randomly re-assigned so that an exact p-value can be calculated from the permuted data.
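As a minimal sketch of the idea, the snippet below compares two small groups by their difference in means. It assumes the samples are small enough to enumerate every possible re-assignment, which is what makes the p-value exact; with larger samples a random subset of re-assignments is usually drawn instead. The function name `exact_permutation_test` and the scores are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_test(group_a, group_b):
    """Two-sided exact p-value for the difference between two group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    n_extreme = 0
    n_total = 0
    # Enumerate every possible re-assignment of n_a observations to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total

# Hypothetical scores for two small groups.
p_value = exact_permutation_test([1, 2, 3], [4, 5, 6])
print(p_value)  # 2 of the 20 re-assignments are at least as extreme: 0.1
```

The p-value is simply the share of re-assignments whose difference is at least as large as the one observed, so no distributional assumption is needed.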

2-Cross-validation

Simple cross-validation. Take regression as an example. In a simple cross-validation, the first sub-sample is used to derive the regression equation, while a second sub-sample is used to generate predicted scores from that equation. The cross-validity coefficient is then computed by correlating the predicted scores with the observed scores on the outcome variable.
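The procedure can be sketched as follows, assuming a regression with a single predictor and a fixed (non-random) split so that the result is reproducible. The helpers `fit_line` and `pearson_r` and the data are hypothetical. With only one predictor the coefficient reduces to the correlation between predictor and outcome in the held-out sub-sample, so the sketch is meant to show the mechanics rather than a realistic analysis.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope

def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

# Hypothetical data: one predictor x and one outcome y.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

# First sub-sample derives the equation; second sub-sample is held out.
x_fit, y_fit = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]

intercept, slope = fit_line(x_fit, y_fit)
predicted = [intercept + slope * xi for xi in x_val]

# Cross-validity coefficient: correlation of predicted with observed scores.
cross_validity = pearson_r(predicted, y_val)
print(round(cross_validity, 4))
```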

Double cross-validation. Double cross-validation goes one step further than its simple counterpart. Taking regression as the example again, regression equations are generated in both sub-samples, and both equations are then used to generate predicted scores and cross-validity coefficients.
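A rough sketch of double cross-validation is shown below. It assumes a single-predictor regression and that each equation is scored on the sub-sample it was not fitted on; the function names and the two sub-samples are hypothetical.

```python
from statistics import mean

def fit_line(pairs):
    """Least-squares intercept and slope from a list of (x, y) pairs."""
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    slope = (sum((x - mx) * (y - my) for x, y in pairs)
             / sum((x - mx) ** 2 for x, _ in pairs))
    return my - slope * mx, slope

def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / (sum((a - mu) ** 2 for a in u)
                  * sum((b - mv) ** 2 for b in v)) ** 0.5

def cross_validity(fit_sample, val_sample):
    """Fit on one sub-sample, then correlate predictions with the other's outcomes."""
    intercept, slope = fit_line(fit_sample)
    predicted = [intercept + slope * x for x, _ in val_sample]
    observed = [y for _, y in val_sample]
    return pearson_r(predicted, observed)

# Hypothetical sub-samples of (predictor, outcome) pairs.
sample_1 = [(1, 2.1), (3, 6.2), (5, 10.1), (7, 13.8), (9, 18.0)]
sample_2 = [(2, 3.9), (4, 7.8), (6, 12.2), (8, 16.1), (10, 19.9)]

# Each sub-sample takes a turn as the derivation sample.
r_12 = cross_validity(sample_1, sample_2)
r_21 = cross_validity(sample_2, sample_1)
print(round(r_12, 4), round(r_21, 4))
```

Two coefficients that are both high and close to each other suggest that the equation does not depend much on which half of the data produced it.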

Multicross-validation. Multicross-validation is an extension of double cross-validation. In this form of cross-validation, the double cross-validation procedure is repeated many times by randomly selecting sub-samples from the data set. In the context of regression analysis, the beta weights computed in each sub-sample are used to predict the outcome variable in the corresponding sub-sample. The observed and predicted scores of the outcome variable in each sub-sample are then used to compute the cross-validated coefficient.
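One way to sketch the repeated procedure is given below. It makes several assumptions that the text does not specify: a single predictor, random half-splits, each equation scored on the half it was not fitted on, and the mean of all coefficients as the summary. The function names, the number of repetitions, the seed, and the data are hypothetical.

```python
import random
from statistics import mean

def fit_line(pairs):
    """Least-squares intercept and slope from a list of (x, y) pairs."""
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    slope = (sum((x - mx) * (y - my) for x, y in pairs)
             / sum((x - mx) ** 2 for x, _ in pairs))
    return my - slope * mx, slope

def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / (sum((a - mu) ** 2 for a in u)
                  * sum((b - mv) ** 2 for b in v)) ** 0.5

def cross_validity(fit_sample, val_sample):
    """Fit on one sub-sample, then correlate predictions with the other's outcomes."""
    intercept, slope = fit_line(fit_sample)
    predicted = [intercept + slope * x for x, _ in val_sample]
    return pearson_r(predicted, [y for _, y in val_sample])

def multicross_validation(data, n_repeats=100, seed=0):
    """Repeat double cross-validation on random half-splits of the data."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    coefficients = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first, second = shuffled[:half], shuffled[half:]
        coefficients.append(cross_validity(first, second))
        coefficients.append(cross_validity(second, first))
    return coefficients

# Hypothetical data: a linear trend plus a small repeating disturbance.
noise = [0.3, -0.2, 0.1, -0.4, 0.2]
data = [(x, 2 * x + noise[x % 5]) for x in range(1, 21)]

coefficients = multicross_validation(data)
mean_coefficient = mean(coefficients)
print(len(coefficients), round(mean_coefficient, 4))
```

The spread of the individual coefficients is as informative as their mean: a wide spread indicates that the equation is sensitive to which cases happen to fall in the derivation sub-sample.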

3-Jackknife

The jackknife is a step beyond cross-validation. In the jackknife, the same test is repeated while leaving one subject out each time, which is why the technique is also called leave-one-out. This procedure is especially useful when the dispersion of the distribution is wide or when extreme scores are present in the data set. In these cases the jackknife is expected to return a bias-reduced estimate.
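The bias reduction can be illustrated with a small sketch. It applies the standard jackknife formulas to the plug-in variance (the version that divides by n and is therefore biased); the function name `jackknife` and the scores are hypothetical. For this particular estimator the jackknife correction reproduces the usual unbiased sample variance, which makes the result easy to check.

```python
from statistics import mean, pvariance, variance

def jackknife(data, estimator):
    """Return the bias-corrected estimate, estimated bias, and standard error."""
    n = len(data)
    full_estimate = estimator(data)
    # Recompute the estimate n times, leaving one observation out each time.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = mean(leave_one_out)

    bias = (n - 1) * (loo_mean - full_estimate)
    corrected = full_estimate - bias
    std_error = ((n - 1) / n
                 * sum((t - loo_mean) ** 2 for t in leave_one_out)) ** 0.5
    return corrected, bias, std_error

# Hypothetical scores containing one extreme value.
scores = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 15.0]

corrected, bias, std_error = jackknife(scores, pvariance)
unbiased = variance(scores)  # divides by n - 1, for comparison
print(round(corrected, 6), round(unbiased, 6))
```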

4-Bootstrap

In the bootstrap, the original sample can be duplicated as many times as computing resources allow, and this expanded sample is then treated as a virtual population. Samples are drawn from this population to verify the estimators. The "source" for resampling in the bootstrap can therefore be much larger than in cross-validation or the jackknife. In addition, unlike cross-validation and the jackknife, the bootstrap employs sampling with replacement.
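In practice the duplication step is rarely carried out literally: drawing with replacement from the original sample behaves like drawing from a virtual population made of many copies of it. The sketch below takes that shortcut and builds a simple percentile interval for the mean. The function names, the number of resamples, the seed, and the data are hypothetical, and the percentile interval is only one of several possible bootstrap intervals.

```python
import random
from statistics import mean

def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Means of resamples drawn with replacement from the original sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # rng.choices samples with replacement, so a value can appear repeatedly.
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

def percentile_interval(values, level=0.95):
    """Simple percentile interval taken from the sorted bootstrap estimates."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1 - tail) * len(ordered)) - 1]
    return lower, upper

# Hypothetical sample of scores.
sample = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 17.0, 13.0, 16.0, 10.0]

boot = bootstrap_means(sample)
lower, upper = percentile_interval(boot)
print(round(lower, 2), round(mean(sample), 2), round(upper, 2))
```

The width of the interval gives a rough picture of how much the estimator would vary from sample to sample.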
