
R language, rebmix package: Chinese help documentation for the REBMIX() function (Chinese-English parallel text)

Posted on 2012-9-25 19:47:46
REBMIX(rebmix)
REBMIX() belongs to the R package: rebmix

REBMIX Algorithm for Univariate or Multivariate Finite Mixture Estimation

Translator: 生物统计家园网 robot LoveR

----------Description----------

Returns the REBMIX algorithm output for mixtures of conditionally independent normal, lognormal, Weibull, gamma, binomial, Poisson or Dirac component densities.
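The model family named in the description can be written out explicitly. The following is a sketch in the notation of the Value section (weights w_l, first and second component parameters), not a formula quoted from the package documentation; the product over the d variables is what "conditionally independent" means here.

```latex
f(\mathbf{y}) = \sum_{l=1}^{c} w_l \prod_{i=1}^{d} f_{il}\!\left(y_i;\, \theta_{1,il}, \theta_{2,il}\right),
\qquad w_l \ge 0, \quad \sum_{l=1}^{c} w_l = 1
```

Each factor f_il is one of the listed parametric families (normal, lognormal, Weibull, gamma, binomial, Poisson or Dirac), chosen per variable i.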


----------Usage----------


REBMIX(Dataset = NULL, Preprocessing = NULL, D = 0.025, cmax = 15,
       Criterion = "AIC", Variables = NULL, pdf = NULL,
       Theta1 = NULL, Theta2 = NULL, K = NULL, ymin = NULL,
       ymax = NULL, b = 1.0, ar = 0.1, Restraints = "loose")



----------Arguments----------

Argument: Dataset
a list of data frames of size n × d containing d-dimensional datasets. Each of the d columns represents one random variable. The number of observations n equals the number of rows in each dataset.


Argument: Preprocessing
a character vector giving the preprocessing types. One of "histogram", "Parzen window" or "k-nearest neighbour".


Argument: D
a total of positive relative deviations standing for the maximum acceptable measure of distance between predictive and empirical densities. It satisfies the relation 0 ≤ D ≤ 1. The default value is 0.025. However, if components with a low probability of occurrence are expected, it has to be decreased. If D = 0 and b = 0, the mixture tends to c = c_max components.


Argument: cmax
maximum number of components c_max > 0. The default value is 15.


Argument: Criterion
a character vector giving the information criterion types. One of default Akaike "AIC", "AIC3", "AIC4" or "AICc", Bayesian "BIC", consistent Akaike "CAIC", Hannan-Quinn "HQC", minimum description length "MDL2" or "MDL5", approximate weight of evidence "AWE", classification likelihood "CLC", integrated classification likelihood "ICL" or "ICL-BIC", partition coefficient "PC", total of positive relative deviations "D" or sum of squares error "SSE".
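To make the role of the criterion concrete, here is a minimal base-R sketch of a few of the listed criteria, computed from a log likelihood log L, the degrees of freedom M and the number of observations n. The formulas are the standard textbook definitions; whether rebmix uses exactly these forms internally is an assumption, and the function name `info_criteria` and the numbers fed into it are hypothetical.

```r
# Standard textbook definitions; rebmix's internal formulas may differ in detail.
# logL: log likelihood, M: degrees of freedom, n: number of observations.
info_criteria <- function(logL, M, n) {
  c(AIC  = -2 * logL + 2 * M,
    AIC3 = -2 * logL + 3 * M,
    AICc = -2 * logL + 2 * M * (1 + (M + 1) / (n - M - 1)),
    BIC  = -2 * logL + M * log(n),
    CAIC = -2 * logL + M * (log(n) + 1),
    HQC  = -2 * logL + 2 * M * log(log(n)))
}

# Hypothetical values for two candidate fits of the same dataset.
fits <- rbind(small = info_criteria(logL = -5200, M = 8,  n = 1000),
              large = info_criteria(logL = -5190, M = 14, n = 1000))
print(round(fits, 1))

# Under each criterion the preferred candidate is the row with the smaller value.
best <- rownames(fits)[apply(fits, 2, which.min)]
names(best) <- colnames(fits)
print(best)
```

With these made-up numbers AIC prefers the larger model while BIC prefers the smaller one, which is why the choice of Criterion can change the number of components that is reported.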


Argument: Variables
a character vector of length d containing types of variables. One of "continuous" or "discrete".


Argument: pdf
a character vector of length d containing continuous or discrete parametric family types. One of "normal", "lognormal", "Weibull", "gamma", "binomial", "Poisson" or "Dirac".


Argument: Theta1
a vector of length d containing initial component parameters. One of n_il = number of categories - 1 for the "binomial" distribution or "NA" otherwise.


Argument: Theta2
a vector of length d containing initial component parameters. The value is NULL.


Argument: K
a vector or a list of vectors containing numbers of bins v for the histogram and the Parzen window or numbers of nearest neighbours k for the k-nearest neighbour. There is no genuine rule to identify v or k. Consequently, the REBMIX algorithm identifies them from the set K of input values by minimizing the information criterion. The Sturges rule v = 1 + log2(n), the Log10 rule v = 10 log10(n) or the RootN rule v = 2 √n can be applied to estimate the limiting numbers of bins, or the rule of thumb k = √n to guess the intermediate number of nearest neighbours.
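The rules quoted for K are easy to evaluate by hand. The sketch below uses base R only; the function name `bin_rules` is hypothetical, and truncating with `as.integer()` follows the Examples section of this page rather than any documented requirement.

```r
# Bin-count and neighbour-count rules from the description of K,
# evaluated for a sample of size n. Results are truncated to integers.
bin_rules <- function(n) {
  c(Sturges = as.integer(1 + log2(n)),    # v = 1 + log2(n)
    Log10   = as.integer(10 * log10(n)),  # v = 10 log10(n)
    RootN   = as.integer(2 * sqrt(n)),    # v = 2 sqrt(n)
    kNN     = as.integer(sqrt(n)))        # k = sqrt(n), rule of thumb
}

rules <- bin_rules(5000)
print(rules)
# Sturges = 13, Log10 = 36, RootN = 141, kNN = 70
```

The Sturges and RootN values can serve as lower and upper limits of the set K, with the Log10 value as an intermediate candidate.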


Argument: ymin
a vector of length d containing minimum observations. The default value is NULL.


Argument: ymax
a vector of length d containing maximum observations. The default value is NULL.


Argument: b
minimum weight multiplier 0 ≤ b ≤ 1. It influences the number of components through the minimum weight w_min = 2 D_min ((l - 1) b + 1). The default value is 1.0.
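The effect of b is easiest to see by tabulating the formula. This page does not define l or D_min, so the sketch below assumes l is the index of the component being considered and treats D_min as a given deviation level; the value 0.025 is a hypothetical input, not a claim about how rebmix sets D_min.

```r
# Minimum weight formula from the description of b:
#   w_min = 2 * D_min * ((l - 1) * b + 1)
# Assumptions: l is a component index, D_min is a given deviation level.
w_min <- function(l, b, D_min) 2 * D_min * ((l - 1) * b + 1)

l <- 1:5
thresholds <- rbind(b0 = w_min(l, b = 0, D_min = 0.025),
                    b1 = w_min(l, b = 1, D_min = 0.025))
colnames(thresholds) <- paste0("l=", l)
print(thresholds)
```

With b = 0 the threshold stays constant at 2 D_min for every l, whereas with b = 1 it grows linearly in l, so later components need a larger weight to be accepted. That is consistent with the remark under D that D = 0 and b = 0 push the mixture towards c_max components.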


Argument: ar
acceleration rate 0 < a_r ≤ 1. The default value is 0.1 and in most cases does not have to be altered.


Argument: Restraints
a character string giving the restraints type. One of "rigid" or default "loose". The rigid restraints are obsolete and applicable to well-separated components only.


----------Value----------


Value: Dataset
a list of data frames of size n × d containing d-dimensional datasets. Each of the d columns represents one random variable. The number of observations n equals the number of rows in each dataset.


Value: w
a list of data frames each containing c component weights w_l summing to 1.


Value: Theta
a list of data frames each containing c parametric family types pdfi. One of "normal", "lognormal", "Weibull", "gamma", "binomial", "Poisson" or "Dirac". Component parameters theta1.i follow the parametric family types. One of μ_il for normal and lognormal distributions and θ_il for Weibull, gamma, binomial, Poisson and Dirac distributions. Component parameters theta2.i follow theta1.i. One of σ_il for normal and lognormal distributions, β_il for Weibull and gamma distributions and p_il for binomial distribution.
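Once weights and component parameters are available, the fitted density can be evaluated directly. The sketch below is base R only and handles the univariate normal case; the numbers are hypothetical stand-ins for the w, theta1 and theta2 values described above, and `dmix` is an illustrative helper, not a rebmix function.

```r
# Hypothetical estimates for a univariate two-component normal mixture.
w      <- c(0.3, 0.7)   # component weights w_l, summing to 1
theta1 <- c(0, 5)       # means mu_l
theta2 <- c(1, 2)       # standard deviations sigma_l

# Mixture density f(y) = sum over l of w_l * dnorm(y, mu_l, sigma_l).
dmix <- function(y) {
  dens <- mapply(function(wl, mu, sigma) wl * dnorm(y, mu, sigma),
                 w, theta1, theta2)
  # One row per evaluation point, one column per component.
  rowSums(matrix(dens, nrow = length(y)))
}

y <- seq(-5, 15, length.out = 9)
print(cbind(y = y, density = dmix(y)))

# Sanity check: a proper mixture density integrates to 1.
area <- integrate(dmix, lower = -Inf, upper = Inf)$value
print(area)
```

For the other families the same pattern applies with the corresponding density function in place of `dnorm`, provided theta1 and theta2 are mapped to that function's own parameterization.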


Value: Variables
a character vector containing types of variables. One of "continuous" or "discrete".


Value: pdf
a character vector containing continuous or discrete parametric family types. One of "normal", "lognormal", "Weibull", "gamma", "binomial", "Poisson" or "Dirac".


Value: Theta1
a vector containing initial component parameters. One of n_il = number of categories - 1 for the "binomial" distribution or "NA" otherwise.


Value: Theta2
a vector containing initial component parameters. The value is NULL.


Value: summary
a data frame with additional information about dataset, preprocessing, D, c_max, information criterion type, b, a_r, restraints type, optimal c, optimal v or k, y_i0, optimal h_i, information criterion IC, log likelihood log L and degrees of freedom M.


Value: pos
position in the summary data frame at which log likelihood log L attains its maximum.


Value: all.Imax
a list of all numbers of iterations.


Value: all.c
a list of all numbers of components.


Value: all.IC
a list of all information criteria.


Value: all.logL
a list of all log likelihoods.


Value: all.D
a list of all totals of positive relative deviations.


----------References----------

21, 65-66.

Nagode M, Fajdiga M (1998). A General Multi-Modal Probability Density Function Suitable for the Rainflow Ranges of Stationary Random Processes. International Journal of Fatigue, 20, 211-223.

Nagode M, Fajdiga M (2000). An Improved Algorithm for Parameter Estimation Suitable for Mixed Weibull Distributions. International Journal of Fatigue, 22, 75-80.

Nagode M, Klemenc J, Fajdiga M (2001). Parametric Modelling and Scatter Prediction of Rainflow Matrices. International Journal of Fatigue, 23, 525-532.

Nagode M, Fajdiga M (2006). An Alternative Perspective on the Mixture Estimation Problem. Reliability Engineering & System Safety, 91, 388-397.

Nagode M, Fajdiga M (2011). The REBMIX Algorithm for the Univariate Finite Mixture Estimation. Communications in Statistics - Theory and Methods, 40(5), 876-892.

Nagode M, Fajdiga M (2011). The REBMIX Algorithm for the Multivariate Finite Mixture Estimation. Communications in Statistics - Theory and Methods, 40(11), 2022-2034.

----------Examples----------


## Load the package that provides RNGMIX() and REBMIX().
library(rebmix)

## Generate the complex 1 dataset.

n <- c(998, 263, 1086, 487, 213, 1076, 232,
  784, 840, 461, 773, 24, 811, 1091, 861)

Theta <- rbind(pdf = "normal",
  theta1 = c(688.4, 265.1, 30.8, 934, 561.6, 854.9, 883.7,
  758.3, 189.3, 919.3, 98, 143, 202.5, 628, 977),
  theta2 = c(12.4, 14.6, 14.8, 8.4, 11.7, 9.2, 6.3, 10.2,
  9.5, 8.1, 14.7, 11.7, 7.4, 10.1, 14.6))

complex1 <- RNGMIX(Dataset = "complex1",
  rseed = -1,
  n = n,
  Theta = Theta)
  
complex1

complex1$Dataset[[1]][1:20, ]  

## Estimate number of components, component weights and component parameters.

v <- c(as.integer(1 + log2(sum(n))), ## Minimum v follows the Sturges rule.
  as.integer(2 * sum(n)^0.5)) ## Maximum v follows the RootN rule.

## Number of classes or nearest neighbours to be processed.

N <- as.integer(log(v[2] / (v[1] + 1)) / log(1 + 1 / v[1]))

K <- c(v[1], as.integer((v[1] + 1) * (1 + 1 / v[1])^(0:N)))

complex1est <- REBMIX(Dataset = complex1$Dataset,
  Preprocessing = "histogram",
  D = 0.0025,
  cmax = 30,
  Criterion = "BIC",
  Variables = "continuous",
  pdf = "normal",
  K = K,
  b = 0.0)
                 
complex1est

## Plot the finite mixture.

plot(complex1est, npts = 1000)
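The construction of K in the example above can be traced without the rebmix package. The following base-R sketch repeats only the arithmetic on n; reading the result as a roughly geometric grid between the Sturges and RootN limits is an interpretation of the code, not a statement from the package documentation.

```r
# Component sizes from the complex 1 example; they add up to 10000 observations.
n <- c(998, 263, 1086, 487, 213, 1076, 232,
  784, 840, 461, 773, 24, 811, 1091, 861)
ntot <- sum(n)

# Limits of the search: Sturges rule (minimum) and RootN rule (maximum).
v <- c(as.integer(1 + log2(ntot)), as.integer(2 * ntot^0.5))

# Number of steps needed to grow from v[1] + 1 to v[2] by factors of (1 + 1 / v[1]).
N <- as.integer(log(v[2] / (v[1] + 1)) / log(1 + 1 / v[1]))

# Candidate bin counts: v[1] itself, then a truncated geometric sequence.
K <- c(v[1], as.integer((v[1] + 1) * (1 + 1 / v[1])^(0:N)))

print(v)
print(N)
print(K)
```

Consecutive candidates are spaced by roughly 7% (a factor of 15/14 here), so small bin counts are examined densely and large ones sparsely, and REBMIX() is run once for each value in K.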

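When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically by a robot, some inaccuracies are unavoidable. Carefully comparing the Chinese and English text while using it can help with learning R.
Note 3: If you find inaccuracies, please reply at the end of this thread and we will revise the document over time.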