nlcv(nlcv)
nlcv() belongs to R package: nlcv
Nested Loop Cross-Validation
Description
This function first performs feature selection and then applies five different classification algorithms.
Usage
nlcv(eset, classVar = "type", nRuns = 2, propTraining = 2/3,
     classdist = c("balanced", "unbalanced"),
     nFeatures = c(2, 3, 5, 7, 10, 15, 20, 25, 30, 35),
     fsMethod = c("randomForest", "t.test", "limma", "none"),
     classifMethods = c("dlda", "randomForest", "bagg", "pam", "svm"),
     fsPar = NULL, initialGenes = seq(length.out = nrow(eset)),
     geneID = "ID", storeTestScores = FALSE, verbose = FALSE)
Arguments
Argument: eset
ExpressionSet object containing the genes to classify.
Argument: classVar
String giving the name of the variable containing the observed class labels.
Argument: nRuns
Number of runs for the outer loop of the cross-validation.
Argument: propTraining
Proportion of the observations to be assigned to the training set. By default propTraining = 2/3.
Argument: classdist
Distribution of classes; indicates whether the class distribution is 'balanced' or 'unbalanced'. The sampling strategy for each run is adapted accordingly.
Argument: nFeatures
Numeric vector with the number of features to be selected from the features kept by the feature selection method. For each number n specified in this vector, the classification algorithms are run using only the top n features.
Argument: fsMethod
Feature selection method; one of "randomForest", "t.test", "limma" or "none".
Argument: classifMethods
Character vector with the classification methods to be used in the analysis; elements can be chosen among "dlda", "lda", "nlda", "randomForest", "bagg", "pam", "svm", "ksvm".
Argument: fsPar
List of further parameters to pass to the feature selection method; currently the default for "randomForest" is an empty list(), whereas for "t.test" one can specify the particular test to be used (the default being list(test = "f")).
Argument: geneID
String giving the name of the gene ID variable in the fData of the expression set to use; this argument was added for people who use, for example, both Entrez IDs and Ensembl gene IDs.
Argument: initialGenes
Initial subset of genes in the ExpressionSet on which to apply the nested loop cross-validation procedure. By default all genes are selected.
Argument: storeTestScores
Should the test scores be stored in the nlcv object? Defaults to FALSE.
Argument: verbose
Should the output be verbose (TRUE) or not (FALSE)?
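To make the roles of propTraining, classdist and nFeatures more concrete, the following base R sketch walks through a single outer-loop run on simulated data. It is an illustration under stated assumptions, not the nlcv implementation: the toy matrix, the per-gene t-test ranking and all object names (x, classLabels, trainIdx, featureSets) are hypothetical, and the package may perform these steps differently.

```r
# Illustrative sketch only; this is NOT the nlcv implementation.
set.seed(42)

# Hypothetical toy data: 100 genes (rows) by 30 samples (columns), two classes.
nGenes <- 100
nSamples <- 30
x <- matrix(rnorm(nGenes * nSamples), nrow = nGenes,
            dimnames = list(paste0("gene", seq_len(nGenes)),
                            paste0("sample", seq_len(nSamples))))
classLabels <- factor(rep(c("A", "B"), each = nSamples / 2))

# Shift the first five genes in class B so that there is a signal to find.
x[1:5, classLabels == "B"] <- x[1:5, classLabels == "B"] + 2

propTraining <- 2/3
nFeatures <- c(2, 5, 10)

# Balanced split: draw the same proportion of samples from each class.
trainIdx <- unlist(lapply(split(seq_len(nSamples), classLabels), function(idx) {
  sample(idx, size = round(propTraining * length(idx)))
}))
testIdx <- setdiff(seq_len(nSamples), trainIdx)

# Rank genes by t-test p-value, using the training samples only.
pValues <- apply(x[, trainIdx], 1, function(g) {
  t.test(g ~ classLabels[trainIdx])$p.value
})
rankedGenes <- names(sort(pValues))

# One candidate feature set per entry of nFeatures (the top n genes).
featureSets <- lapply(nFeatures, function(n) head(rankedGenes, n))
names(featureSets) <- paste0("top", nFeatures)
```

Each element of featureSets would then be used to train the classifiers on the training samples and to score them on testIdx. Ranking the genes on the training samples only keeps the test samples out of feature selection.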
Details
TODO
Value
The result is an object of class 'nlcv'. It is a list with two components, output and features.
The output component is a list of five components, one for each classification algorithm used. Each of these has as many components as there are elements in the nFeatures vector. These contain both the error rates for each run (component errorRate) and the predicted labels for each run (character matrix labelsMat).
The features component is a list with as many components as there are runs. For each run, a named vector is given with the variable importance measure for each gene. For t-test-based feature selection, p-values are used; for random-forest-based feature selection, the variable importance measure is given.
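The nesting described above can be explored with ordinary list tools. The sketch below builds a small mock object by hand that mirrors the documented layout and then summarises it. It is an assumption-laden illustration: the element names "nfeat2" and "nfeat5", the labelsMat dimensions and all numbers are hypothetical placeholders, and a real nlcv result may name or shape its components differently.

```r
# Hand-built mock mirroring the documented layout; not produced by nlcv().
nRuns <- 2

# Hypothetical helper: one (classifier, number of features) entry.
mockEntry <- function(errorRate) {
  list(errorRate = errorRate,                            # one value per run
       labelsMat = matrix("A", nrow = 3, ncol = nRuns))  # placeholder labels
}

res <- list(
  output = list(
    dlda = list(nfeat2 = mockEntry(c(0.30, 0.20)),
                nfeat5 = mockEntry(c(0.10, 0.20))),
    svm  = list(nfeat2 = mockEntry(c(0.40, 0.20)),
                nfeat5 = mockEntry(c(0.20, 0.00)))
  ),
  # One named vector per run (here: made-up p-values per gene).
  features = list(
    c(gene1 = 0.001, gene2 = 0.20),
    c(gene1 = 0.004, gene2 = 0.15)
  )
)

# Mean error rate over runs: rows are feature counts, columns are classifiers.
meanError <- sapply(res$output, function(perClassifier) {
  sapply(perClassifier, function(entry) mean(entry$errorRate))
})

# Gene with the smallest p-value in each run.
topGenePerRun <- sapply(res$features, function(v) names(which.min(v)))
```

With an actual result, names(res$output) and names(res$output[[1]]) would show which names to use in place of the placeholders.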
Note
The variable importance measure used is the third column of the output returned by the randomForest function.
Author(s)
Willem Talloen and Tobias Verbeke