R language, dgof package: help documentation for the ks.test() function

loveR · posted on 2012-9-16 19:37:45

ks.test(dgof)
ks.test() belongs to the R package: dgof

                                    Kolmogorov-Smirnov Tests

                                       Translator: LoveR, the robot of Biostatistic Home (生物统计家园网)

----------Description----------

Performs one- or two-sample Kolmogorov-Smirnov tests.

----------Usage----------

ks.test(x, y, ...,
      alternative = c("two.sided", "less", "greater"),
      exact = NULL, tol=1e-8, simulate.p.value=FALSE, B=2000)

----------Arguments----------

Argument: x
a numeric vector of data values.

Argument: y
a numeric vector of data values, or a character string naming a cumulative distribution function, or an actual cumulative distribution function such as pnorm. Alternatively, y can be an ecdf function (or an object of class stepfun) specifying a discrete distribution.

Argument: ...
parameters of the distribution specified (as a character string) by y.

Argument: alternative
indicates the alternative hypothesis and must be one of "two.sided" (default), "less", or "greater". You can specify just the initial letter of the value, but the argument name must be given in full. See "Details" for the meanings of the possible values.

Argument: exact
NULL or a logical indicating whether an exact p-value should be computed. See "Details" for the meaning of NULL. Not used for the one-sided two-sample case.

Argument: tol
used as an upper bound for possible rounding error in values (say, a and b) when checking for equality (a==b); a value of NA or 0 does exact comparisons but risks errors due to numerical imprecision.

Argument: simulate.p.value
a logical indicating whether to compute p-values by Monte Carlo simulation (for discrete goodness-of-fit tests only).

Argument: B
an integer specifying the number of replicates used in the Monte Carlo test (for discrete goodness-of-fit tests only).
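
To illustrate why the tol argument exists, here is a minimal sketch in base R (not taken from the dgof sources) of two values that are equal on paper but differ in floating point, and of a tolerance-based comparison of the kind tol describes. The variable names are illustrative only.

```r
# Two quantities that are equal in exact arithmetic
a <- 0.1 + 0.2
b <- 0.3

exact_equal  <- (a == b)            # exact comparison: affected by rounding error
tol          <- 1e-8                # same value as the default of the tol argument
approx_equal <- abs(a - b) < tol    # comparison up to the tolerance

print(exact_equal)
print(approx_equal)
```

The exact comparison fails because of rounding error, while the tolerance-based one succeeds, which is the risk the description of tol warns about when it is set to NA or 0.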

----------Details----------

If y is numeric, a two-sample test is performed of the null hypothesis that x and y were drawn from the same continuous distribution.

Alternatively, y can be a character string naming a continuous (cumulative) distribution function (or such a function itself), or an ecdf function (or an object of class stepfun) giving a discrete distribution. In these cases, a one-sample test is carried out of the null hypothesis that the distribution function which generated x is the distribution y, with parameters specified by the ... arguments.
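
As a small illustration of the first two forms of y described above, the following sketch passes the same continuous null distribution once by name and once as a function. It only covers the continuous case; the data and parameter values are arbitrary choices for the example.

```r
set.seed(1)
x <- rnorm(50)

# ks.test resolves to dgof's version if that package is attached,
# otherwise to the one in the stats package.
by_name <- ks.test(x, "pnorm", mean = 0, sd = 1)   # CDF named by a character string
by_fun  <- ks.test(x, pnorm, mean = 0, sd = 1)     # CDF passed as a function

# Both calls describe the same null hypothesis, so the statistics agree
print(c(by_name$statistic, by_fun$statistic))
```

The extra arguments mean and sd are forwarded through ... to the distribution function.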

The presence of ties generates a warning unless y describes a discrete distribution (see above), since continuous distributions do not generate them.
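
A minimal sketch of the tie warning with a continuous y follows. The data vector is made up for the example, and the warning is captured with tryCatch() so its text can be inspected; the exact wording of the message may differ between versions.

```r
x_ties <- c(0.1, 0.1, 0.5, 1.2)   # 0.1 appears twice: a tie

# Capture the warning message instead of letting it print
msg <- tryCatch(
  {
    ks.test(x_ties, "pnorm")
    NA_character_                 # reached only if no warning was raised
  },
  warning = function(w) conditionMessage(w)
)
print(msg)
```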

The possible values "two.sided", "less" and "greater" of alternative specify the null hypothesis that the true distribution function of x is equal to, not less than or not greater than the hypothesized distribution function (one-sample case) or the distribution function of y (two-sample case), respectively. This is a comparison of cumulative distribution functions, and the test statistic is the maximum difference in value, with the statistic in the "greater" alternative being D^+ = max_u [F_x(u) - F_y(u)]. Thus in the two-sample case alternative="greater" includes distributions for which x is stochastically smaller than y (the CDF of x lies above and hence to the left of that for y), in contrast to t.test or wilcox.test.
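
The statistics described above can be sketched by hand in base R for the two-sample case. This is an illustration of the definition, not the package's internal code; the samples and the names d_plus, d_minus and d_two are assumptions made for the example.

```r
set.seed(1)
x <- rnorm(50)
y <- rnorm(40, mean = 1)

Fx <- ecdf(x)
Fy <- ecdf(y)
u  <- sort(c(x, y))     # the empirical CDFs only change at observed points

d_plus  <- max(Fx(u) - Fy(u))        # statistic for alternative = "greater"
d_minus <- max(Fy(u) - Fx(u))        # statistic for alternative = "less"
d_two   <- max(abs(Fx(u) - Fy(u)))   # statistic for alternative = "two.sided"

# Compare with the statistic reported by ks.test()
d_reported <- unname(ks.test(x, y, alternative = "greater")$statistic)
print(c(by_hand = d_plus, reported = d_reported))
```

Because y is shifted to the right here, the CDF of x lies above that of y, so d_plus is the large one-sided statistic even though x tends to be smaller than y.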

Exact p-values are not available for the one-sided two-sample case, or in the case of ties if y is continuous. If exact = NULL (the default), an exact p-value is computed if the sample size is less than 100 in the one-sample case with y continuous, or if the sample size is less than or equal to 30 with y discrete, or if the product of the sample sizes is less than 10000 in the two-sample case for continuous y. Otherwise, asymptotic distributions are used, whose approximations may be inaccurate in small samples. In the one-sample two-sided case with y continuous, exact p-values are obtained as described in Marsaglia, Tsang & Wang (2003); the formula of Birnbaum & Tingey (1951) is used for the one-sample one-sided case.
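
The default rule for exact = NULL described above can be written down as a small decision function. This is a hypothetical helper for illustration only: the name default_exact, its arguments and the type labels are not part of dgof, and treating ties as ruling out exact p-values in both continuous cases is one reading of the text.

```r
# Hypothetical helper mirroring the documented rule for exact = NULL.
# type is one of "one.sample.continuous", "one.sample.discrete", "two.sample".
default_exact <- function(type, n_x, n_y = NULL,
                          one_sided = FALSE, ties = FALSE) {
  switch(type,
    # one-sample, continuous y: fewer than 100 observations and no ties
    one.sample.continuous = (n_x < 100) && !ties,
    # one-sample, discrete y: at most 30 observations
    one.sample.discrete   = (n_x <= 30),
    # two-sample: product of sizes below 10000, two-sided, no ties
    two.sample            = (n_x * n_y < 10000) && !one_sided && !ties,
    stop("unknown type: ", type)
  )
}

print(default_exact("one.sample.continuous", n_x = 50))
print(default_exact("one.sample.discrete", n_x = 31))
print(default_exact("two.sample", n_x = 50, n_y = 30))
```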

In the one-sample case with y discrete, the methods presented in Conover (1972) and Gleser (1985) are used when exact=TRUE (or when exact=NULL) and length(x)<=30, as described above. When exact=FALSE, or exact=NULL with length(x)>30, the test is not exact and the resulting p-values are known to be conservative. Using exact=TRUE with sample sizes greater than 30 is not advised due to numerical instabilities; in such cases, simulated p-values may be desirable.
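
To show what a simulated p-value means for a discrete null distribution, here is a generic Monte Carlo sketch in base R. It is not the code used inside dgof: the helper names, the requirement that all observations lie in the support, and the (1 + count) / (B + 1) convention are assumptions of this example.

```r
# Hypothetical helper: two-sided KS statistic against a discrete null
# distribution given by its support points and probabilities.
# Assumes every observation in x is one of the support points.
discrete_ks_stat <- function(x, support, prob) {
  F0     <- cumsum(prob)                               # hypothesized CDF at the support points
  counts <- tabulate(match(x, support), nbins = length(support))
  Fn     <- cumsum(counts) / length(x)                 # empirical CDF at the support points
  max(abs(Fn - F0))
}

# Hypothetical helper: Monte Carlo p-value from B samples drawn under the null.
simulate_p_value <- function(x, support, prob, B = 2000, tol = 1e-8) {
  d_obs <- discrete_ks_stat(x, support, prob)
  d_sim <- replicate(B, {
    idx <- sample.int(length(support), length(x), replace = TRUE, prob = prob)
    discrete_ks_stat(support[idx], support, prob)
  })
  # Count simulated statistics at least as large as the observed one,
  # allowing for rounding error of size tol in the comparison.
  (1 + sum(d_sim >= d_obs - tol)) / (B + 1)
}

set.seed(1)
x_obs <- rep(1, 3)
d_obs <- discrete_ks_stat(x_obs, support = 1:3, prob = rep(1/3, 3))
p_sim <- simulate_p_value(x_obs, support = 1:3, prob = rep(1/3, 3), B = 10000)
print(c(statistic = d_obs, p.value = p_sim))
```

The tolerance in the comparison matters here: statistics that are equal in exact arithmetic can differ in the last bit in floating point, which would otherwise bias the count.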

If a single-sample test is used with y continuous, the parameters specified in ... must be pre-specified and not estimated from the data. There is some more refined distribution theory for the KS test with estimated parameters (see Durbin, 1973), but that is not implemented in ks.test.
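
The following simulation sketch illustrates why the parameters must be pre-specified. It repeatedly draws normal samples (so the null hypothesis is true) and compares how often the test rejects at the 5% level when the parameters are fixed in advance versus estimated from the same data. The number of repetitions and the sample size are arbitrary choices for the example.

```r
set.seed(1)
n_rep <- 500
n     <- 50

p_fixed <- numeric(n_rep)   # parameters fixed in advance (valid use)
p_est   <- numeric(n_rep)   # parameters estimated from the same data (invalid use)

for (i in seq_len(n_rep)) {
  x <- rnorm(n)                                      # the null hypothesis is true
  p_fixed[i] <- ks.test(x, "pnorm", 0, 1)$p.value
  p_est[i]   <- ks.test(x, "pnorm", mean(x), sd(x))$p.value
}

# A valid test rejects a true null in about 5% of cases at the 5% level
print(c(fixed = mean(p_fixed < 0.05), estimated = mean(p_est < 0.05)))
```

With estimated parameters the fitted distribution follows the sample too closely, so the reported p-values are too large and the test rejects far less often than its nominal level.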

----------Value----------

A list with class "htest" containing the following components:

Component: statistic
the value of the test statistic.

Component: p.value
the p-value of the test.

Component: alternative
a character string describing the alternative hypothesis.

Component: method
a character string indicating what type of test was performed.

Component: data.name
a character string giving the name(s) of the data.
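
A short sketch of how the components listed above can be read from the returned object; the data and the object name res are illustrative.

```r
set.seed(1)
x <- rnorm(50)
res <- ks.test(x, "pnorm")

print(inherits(res, "htest"))   # the result carries class "htest"
print(res$statistic)            # value of the test statistic
print(res$p.value)              # p-value of the test
print(res$alternative)          # description of the alternative hypothesis
print(res$method)               # type of test performed
print(res$data.name)            # name(s) of the data
```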

----------Author(s)----------

Modified by Taylor B. Arnold and John W. Emerson to include
one-sample testing with a discrete distribution (as presented
in Conover's 1972 paper – see references).

----------References----------

Z. W. Birnbaum and Fred H. Tingey (1951), One-sided confidence contours for probability distribution functions. The Annals of Mathematical Statistics, 22/4, 592–596.
William J. Conover (1971), Practical Nonparametric Statistics. New York: John Wiley & Sons. Pages 295–301 (one-sample Kolmogorov test), 309–314 (two-sample Smirnov test).
William J. Conover (1972), A Kolmogorov Goodness-of-Fit Test for Discontinuous Distributions. Journal of American Statistical Association, Vol. 67, No. 339, 591–596.
Leon Jay Gleser (1985), Exact Power of Goodness-of-Fit Tests of Kolmogorov Type for Discontinuous Distributions. Journal of American Statistical Association, Vol. 80, No. 392, 954–958.
J. Durbin (1973), Distribution theory for tests based on the sample distribution function. SIAM.
George Marsaglia, Wai Wan Tsang and Jingbo Wang (2003), Evaluating Kolmogorov's distribution. Journal of Statistical Software, 8/18. http://www.jstatsoft.org/v08/i18/.

----------See Also----------

shapiro.test, which performs the Shapiro-Wilk test for normality; cvm.test for Cramer-von Mises type tests.

----------Examples----------

require(graphics)
require(dgof)

set.seed(1)

x <- rnorm(50)
y <- runif(30)
# Do x and y come from the same distribution?
ks.test(x, y)
# Does x come from a shifted gamma distribution with shape 3 and rate 2?
ks.test(x+2, "pgamma", 3, 2) # two-sided, exact
ks.test(x+2, "pgamma", 3, 2, exact = FALSE)
ks.test(x+2, "pgamma", 3, 2, alternative = "gr")

# test if x is stochastically larger than x2
x2 <- rnorm(50, -1)
plot(ecdf(x), xlim=range(c(x, x2)))
plot(ecdf(x2), add=TRUE, lty="dashed")
t.test(x, x2, alternative="g")
wilcox.test(x, x2, alternative="g")
ks.test(x, x2, alternative="l")

#########################################################
# TBA, JWE new examples added for discrete distributions:

x3 <- sample(1:10, 25, replace=TRUE)

# Using ecdf() to specify a discrete distribution:
ks.test(x3, ecdf(1:10))

# Using stepfun() to specify the same discrete distribution:
myfun <- stepfun(1:10, cumsum(c(0, rep(0.1, 10))))
ks.test(x3, myfun)

# The previous R ks.test() does not correctly calculate the
# test statistic for discrete distributions (gives warning):
# stats::ks.test(c(0, 1), ecdf(c(0, 1)))
# ks.test(c(0, 1), ecdf(c(0, 1)))

# Even when the correct test statistic is given, the
# previous R ks.test() gives conservative p-values:
stats::ks.test(rep(1, 3), ecdf(1:3))
ks.test(rep(1, 3), ecdf(1:3))
# simulate.p.value comes after ... in the signature, so it must be spelled out in full
ks.test(rep(1, 3), ecdf(1:3), simulate.p.value=TRUE, B=10000)

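When reposting, please credit the source: Biostatistic Home (http://www.biostatistic.net).

Notes:
Note 1: For ease of study, this document was translated by LoveR, the Biostatistic Home robot. It is intended only as a personal reference for learning R, and Biostatistic Home retains the copyright.
Note 2: Because the translation was produced automatically by a robot, inaccuracies are unavoidable; carefully comparing the Chinese and English content can help when learning R.
Note 3: If you find inaccuracies, please reply at the end of this post and we will gradually revise it.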
