R SPIA package: help documentation for the spia() function

loveR · posted 2012-2-26 14:52:36

spia(SPIA)
R package containing spia(): SPIA

Signaling Pathway Impact Analysis (SPIA) based on over-representation and signaling perturbations accumulation

Translator: LoveR, the automated translation robot of biostatistic.net

----------Description----------

This function implements the SPIA algorithm to analyze KEGG signaling pathways.

----------Usage----------

spia(de=NULL,all=NULL,organism="hsa",pathids=NULL,nB=2000,plots=FALSE,verbose=TRUE,beta=NULL,combine="fisher")

----------Arguments----------

Argument: de
A named numeric vector containing the log2 fold-changes of the differentially expressed genes. The names of this vector are Entrez gene IDs.

Argument: all
A vector with the Entrez IDs in the reference set. If the data were obtained from a microarray experiment, this set contains all genes present on the specific array used for the experiment. This vector should contain all names of the de argument.
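A minimal sketch of how de and all might be assembled, assuming a limma-style table with the hypothetical columns ENTREZ, logFC and adj.P.Val. The values and the 0.1 cutoff below are made up for illustration and are not taken from the SPIA data.

```r
# Toy stand-in for a limma topTable result; all values are invented.
top <- data.frame(
  ENTREZ    = c("3491", "2353", "1958", "1843", "3725"),
  logFC     = c(5.96, 5.14, 4.15, 2.43, 1.53),
  adj.P.Val = c(0.001, 0.004, 0.020, 0.300, 0.600),
  stringsAsFactors = FALSE
)

# Differentially expressed subset (illustrative cutoff).
tg <- top[top$adj.P.Val < 0.1, ]

# de: named numeric vector of log2 fold-changes; names are Entrez IDs.
de <- setNames(tg$logFC, tg$ENTREZ)

# all: every Entrez ID in the reference set (named all_ids here to avoid
# masking base::all).
all_ids <- top$ENTREZ

# Every name of de should appear in the reference set.
stopifnot(all(names(de) %in% all_ids))
```

The two objects would then be passed as spia(de = de, all = all_ids, ...).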

Argument: organism
A three-letter character string designating the organism (the default is "hsa"). See the full list at ftp://ftp.genome.jp/pub/kegg/xml/organisms.

Argument: pathids
A character vector with the names of the pathways to be analyzed. If left NULL, all available pathways are tested.

Argument: nB
Number of bootstrap iterations used to compute the pPERT value. Should be larger than 100. The recommended value is 2000.

Argument: plots
If set to TRUE, the function plots the gene perturbation accumulation vs. log2 fold-change for every gene on each pathway. The null distribution of the total net accumulations, from which pPERT is computed, is plotted as well. The figures are written to the file SPIAPerturbationPlots.pdf in the current directory.

Argument: verbose
If set to TRUE, displays the number of pathways already analyzed.

Argument: beta
Weights to be assigned to each type of gene/protein relation. It should be a named numeric vector of length 23, whose names must be: c("activation","compound","binding/association","expression","inhibition","activation_phosphorylation","phosphorylation", "indirect","inhibition_phosphorylation","dephosphorylation_inhibition","dissociation","dephosphorylation","activation_dephosphorylation", "state","activation_indirect","inhibition_ubiquination","ubiquination","expression_indirect","indirect_inhibition","repression", "binding/association_phosphorylation","dissociation_phosphorylation","indirect_phosphorylation"). If set to NULL, beta defaults to: c(1,0,0,1,-1,1,0,0,-1,-1,0,0,1,0,1,-1,0,1,-1,-1,0,0,0).
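The following sketch builds the named weight vector described above from the 23 relation names and the default weights listed for beta. The final line, which changes one weight, is a hypothetical customization shown only to illustrate how an entry can be adjusted by name.

```r
# Relation types, in the order given in the description of beta.
rel_names <- c(
  "activation", "compound", "binding/association", "expression",
  "inhibition", "activation_phosphorylation", "phosphorylation",
  "indirect", "inhibition_phosphorylation", "dephosphorylation_inhibition",
  "dissociation", "dephosphorylation", "activation_dephosphorylation",
  "state", "activation_indirect", "inhibition_ubiquination",
  "ubiquination", "expression_indirect", "indirect_inhibition",
  "repression", "binding/association_phosphorylation",
  "dissociation_phosphorylation", "indirect_phosphorylation"
)

# Default weights, in the same order.
default_weights <- c(1, 0, 0, 1, -1, 1, 0, 0, -1, -1, 0, 0,
                     1, 0, 1, -1, 0, 1, -1, -1, 0, 0, 0)

beta_default <- setNames(default_weights, rel_names)
stopifnot(length(beta_default) == 23, !anyDuplicated(names(beta_default)))

# Hypothetical customization: treat plain phosphorylation as activating.
beta_custom <- beta_default
beta_custom["phosphorylation"] <- 1
```

A vector such as beta_custom could then be supplied through the beta argument instead of NULL.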

Argument: combine
Method used to combine the two types of p-values. If set to "fisher", Fisher's method is used. If set to "norminv", the normal inversion method is used.
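To make the two options concrete, here is a minimal base-R sketch of combining two p-values. The function combine_p is a hypothetical helper written for this illustration; it uses the textbook forms of Fisher's method and of the normal inversion (Stouffer) method and is not the package's own code, which may handle edge cases such as missing values differently.

```r
# Hypothetical helper: combine two p-values (vectorized over p1 and p2).
combine_p <- function(p1, p2, method = c("fisher", "norminv")) {
  method <- match.arg(method)
  stopifnot(all(p1 > 0 & p1 <= 1), all(p2 > 0 & p2 <= 1))
  if (method == "fisher") {
    # -2 * sum(log(p)) follows a chi-square distribution with 2 * 2 = 4
    # degrees of freedom when both p-values are independent and uniform.
    stat <- -2 * (log(p1) + log(p2))
    pchisq(stat, df = 4, lower.tail = FALSE)
  } else {
    # Convert to z-scores, average with equal weights, convert back.
    pnorm((qnorm(p1) + qnorm(p2)) / sqrt(2))
  }
}

p_fisher  <- combine_p(0.01, 0.2, "fisher")
p_norminv <- combine_p(0.01, 0.2, "norminv")
```

For two p-values, the Fisher result has the closed form k - k*log(k) with k = p1*p2, which is a convenient way to check the computation.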

----------Details----------

See the cited references for more details.

----------Value----------

A data frame containing the ranked pathways and various statistics: pSize is the number of genes on the pathway; NDE is the number of DE genes per pathway; tA is the observed total perturbation accumulation in the pathway; pNDE is the probability of observing at least NDE genes on the pathway using a hypergeometric model; pPERT is the probability of observing a total accumulation more extreme than tA by chance alone; pG is the p-value obtained by combining pNDE and pPERT; pGFdr and pGFWER are the False Discovery Rate and Bonferroni adjusted global p-values, respectively; and Status gives the direction in which the pathway is perturbed (activated or inhibited). KEGGLINK gives a web link to the KEGG website that displays the pathway image with the differentially expressed genes highlighted in red.
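The over-representation p-value pNDE described above can be illustrated with base R's hypergeometric functions. This is a sketch under assumed toy counts (the four numbers below are invented), showing one standard way to compute the probability of seeing at least NDE differentially expressed genes on a pathway.

```r
# Invented counts for illustration only.
n_all  <- 20000  # genes in the reference set (length of all)
n_de   <- 1000   # differentially expressed genes (length of de)
p_size <- 100    # genes on the pathway (pSize)
nde    <- 12     # DE genes on the pathway (NDE)

# P(X >= nde) when drawing n_de genes from a set that contains p_size
# pathway genes and n_all - p_size other genes. phyper() with
# lower.tail = FALSE returns P(X > q), hence q = nde - 1.
p_nde <- phyper(nde - 1, m = p_size, n = n_all - p_size, k = n_de,
                lower.tail = FALSE)

# The same probability as an explicit sum over the upper tail.
p_nde_check <- sum(dhyper(nde:p_size, m = p_size, n = n_all - p_size, k = n_de))
```

With these counts the expected number of DE genes on the pathway is 5, so observing 12 gives a small p_nde.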

----------Author(s)----------

Adi Laurentiu Tarca <atarca@med.wayne.edu>, Purvesh Khatri, Sorin Draghici

----------References----------

Microarray Experiments, 2008, Bioinformatics, 2009, 25(1):75-82.
approach for the steady-state analysis of gene signaling networks. Progress in Pattern Recognition, Image Analysis and Applications, Lecture Notes in Computer Science. 4756:32-41, November 2007.
A systems biology approach for pathway level analysis. Genome Research, 17, 2007.

----------See Also----------

plotP

----------Examples----------

# Example using a colorectal cancer dataset obtained using Affymetrix GeneChip technology (GEO GSE4107).
# Suppose that proper preprocessing was performed and a two-group moderated t-test was applied. The topTable
# result from the limma package for this data set is called "top".
# The following lines annotate each probeset with an Entrez ID, keep the most significant probeset for each
# gene ID, and retain those with FDR<0.1 as differentially expressed.
# You can run these lines if the hgu133plus2.db package is available.

#library(hgu133plus2.db)
#data(colorectalcancer)
#x <- hgu133plus2ENTREZID
#top$ENTREZ<-unlist(as.list(x[top$ID]))
#top<-top[!is.na(top$ENTREZ),]
#top<-top[!duplicated(top$ENTREZ),]
#tg1<-top[top$adj.P.Val<0.1,]
#DE_Colorectal=tg1$logFC
#names(DE_Colorectal)<-as.vector(tg1$ENTREZ)
#ALL_Colorectal=top$ENTREZ

library(SPIA)
data(colorectalcancer)

# pathway analysis using SPIA; use nB=2000 or higher for more accurate results
res<-spia(de=DE_Colorectal, all=ALL_Colorectal, organism="hsa",beta=NULL,nB=2000,plots=FALSE, verbose=TRUE,combine="fisher")
res
# create the evidence plot
plotP(res)

# now combine pNDE and pPERT using the normal inversion method without running the spia function again
res$pG=combfunc(res$pNDE,res$pPERT,combine="norminv")
res$pGFdr=p.adjust(res$pG,"fdr")
res$pGFWER=p.adjust(res$pG,"bonferroni")
plotP(res,threshold=0.05)
# highlight the colorectal cancer pathway in green
points(I(-log(pPERT))~I(-log(pNDE)),data=res[res$ID=="05210",],col="green",pch=19,cex=1.5)
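The adjustment and filtering steps used in the example can be tried without the package on a small stand-in table. The sketch below assumes a toy data frame with invented pathway names and pG values; only the column names pG, pGFdr and pGFWER follow the result table described under Value.

```r
# Toy stand-in for a spia() result; names and p-values are invented.
res_toy <- data.frame(
  Name = c("Pathway A", "Pathway B", "Pathway C", "Pathway D"),
  pG   = c(0.0004, 0.03, 0.2, 0.01),
  stringsAsFactors = FALSE
)

# Recompute the adjusted global p-values, as in the example above.
res_toy$pGFdr  <- p.adjust(res_toy$pG, "fdr")
res_toy$pGFWER <- p.adjust(res_toy$pG, "bonferroni")

# Rank by global p-value and keep pathways below an FDR threshold
# (0.05 here is an illustrative choice).
res_toy <- res_toy[order(res_toy$pG), ]
sig <- res_toy[res_toy$pGFdr < 0.05, ]
print(sig)
```

Bonferroni multiplies each p-value by the number of pathways tested, so it is always at least as conservative as the FDR adjustment on the same set.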
