R-Description Data(step 3)
Published: 2019-06-21


[I]Description

1.overall description

summary()

> summary(data)        #min,lower quartile,median,mean,upper quartile,max

sapply()

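> sapply(x,FUN,options)        #apply FUN to every column of x;options are extra arguments passed on to FUN
#typical FUN:mean(),sd(),var(),min(),max(),median(),length(),range(),quantile(),fivenum()
#a user-written FUN can also return several values,e.g. mean,standard deviation,skewness,kurtosis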
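A minimal sketch of the two calls above, using the built-in mtcars data. The helper name mystats and the chosen columns are assumptions made for illustration only; the skewness and kurtosis formulas are the simple moment-based ones.

```r
# mystats() is a hypothetical helper; mtcars ships with base R.
mystats <- function(x, na.omit = FALSE) {
  if (na.omit) x <- x[!is.na(x)]        # optionally drop missing values
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x - m)^3 / s^3) / n      # moment-based skewness
  kurt <- sum((x - m)^4 / s^4) / n - 3  # excess kurtosis
  c(n = n, mean = m, stdev = s, skew = skew, kurtosis = kurt)
}

vars <- c("mpg", "hp", "wt")
print(summary(mtcars[vars]))            # min, quartiles, median, mean, max

# na.omit = TRUE is the "options" part: it is passed on to mystats()
overall <- sapply(mtcars[vars], mystats, na.omit = TRUE)
print(overall)
```

sapply() returns one column per variable and one row per statistic, so the result can be indexed like overall["mean", "mpg"].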

describe() of Hmisc

> describe(data)        #number of variables and observations,number of missing and unique values,mean,
                        #quantiles,five lowest and five highest values

stat.desc() of pastecs

> stat.desc(data)
#- basic=TRUE(default)#number of values,null values,missing values,min,max,range,sum
#- desc=TRUE(default)#median,mean,standard error of the mean,confidence interval of the mean(p=0.95 by default),variance,standard deviation,coefficient of variation
#- norm=TRUE(not default)#normal distribution statistics,including skewness and kurtosis(and their significance) and the Shapiro-Wilk test

describe() of psych

> describe(data)        #number of non-missing observations,mean,standard deviation,median,trimmed mean,
                        #median absolute deviation,min,max,range,skewness,kurtosis,standard error of the mean

2.description by group

aggregate()

> aggregate(data,by=list(INDICES),FUN)        #FUN returns a single statistic per group

by()

> by(data,INDICES,FUN)        #FUN can return multiple statistics per group

summaryBy() of doBy

> summaryBy(formula,data=dataframe,FUN=function)        #one or more grouping variables
#formula = var1 + var2 + var3 + ... + varN ~ groupvar1 + groupvar2 + ... + groupvarN
#(var1...varN are numeric variables,groupvar1...groupvarN are grouping variables)

describeBy() of psych

> describeBy(data,list(INDICES))        #several grouping variables are crossed with each other
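The two base R functions of this part can be sketched as follows, again on mtcars with am as the grouping variable. The helper dstats is a hypothetical name; summaryBy() and describeBy() are left out because they need extra packages.

```r
vars <- c("mpg", "hp", "wt")

# aggregate(): FUN returns one value per variable and group
group_means <- aggregate(mtcars[vars], by = list(am = mtcars$am), FUN = mean)
print(group_means)

# by(): FUN receives the sub data frame of each group and may return
# several statistics at once (dstats is a hypothetical helper)
dstats <- function(d) sapply(d, function(x) c(mean = mean(x), sd = sd(x)))
group_stats <- by(mtcars[vars], mtcars$am, dstats)
print(group_stats)
```

Naming the grouping variable inside list() (am = mtcars$am) gives the first column of the aggregate() result a readable name instead of Group.1.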

3.contingency table

traditional

Function                      Description
table(var1,var2, ... ,varN)   N-way contingency table from N categorical variables
xtabs(formula,data)           N-way contingency table from a formula(e.g. ~A+B) and a matrix or data frame
prop.table(table,margins)     convert frequencies to proportions over the given margins
margin.table(table,margin)    marginal sums over the given margin
addmargins(table,margins)     add marginal sums(totals by default) to the table
ftable(table)                 compact "flat" contingency table
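As a rough illustration of the table functions above, here is a sketch on mtcars. The choice of cyl, gear and am as categorical variables is an assumption for the example.

```r
# two-way frequency table, built in two equivalent ways
tab  <- table(cyl = mtcars$cyl, gear = mtcars$gear)
tab2 <- xtabs(~ cyl + gear, data = mtcars)

row_prop   <- prop.table(tab, 1)    # proportions within each row (margin 1)
cyl_totals <- margin.table(tab, 1)  # row sums
with_sums  <- addmargins(tab)       # adds a "Sum" row and a "Sum" column

# three-way table printed in compact form
flat <- ftable(xtabs(~ cyl + gear + am, data = mtcars))

print(tab)
print(round(row_prop, 2))
print(with_sums)
print(flat)
```

In prop.table() and margin.table(), margin 1 refers to the first variable of the table (rows) and margin 2 to the second (columns).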

CrossTable() of gmodels

> CrossTable(data1,data2)

[II]Test

1.known sample

- independence

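Chi-square
> chisq.test(mytable)        #small p(e.g. p<0.01):variables are related;p>0.05:no evidence of a relationship
Fisher's exact test
> fisher.test(mytable)        #mytable can be any two-way table,not only 2×2
Cochran-Mantel-Haenszel
> mantelhaen.test(mytable)        #mytable is a three-way table;assumes no three-way interaction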
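A sketch of the three independence tests on data that ships with R. The small 2×2 matrix is made up for illustration (a tea-tasting style layout); HairEyeColor and UCBAdmissions are built-in tables.

```r
# two-way table: hair colour by eye colour, summed over sex
hair_eye <- margin.table(HairEyeColor, c(1, 2))
chi <- chisq.test(hair_eye)
print(chi)

# Fisher's exact test on a small 2x2 table (illustrative counts)
tea <- matrix(c(3, 1, 1, 3), nrow = 2,
              dimnames = list(Guess = c("Milk", "Tea"),
                              Truth = c("Milk", "Tea")))
fis <- fisher.test(tea)
print(fis)

# Cochran-Mantel-Haenszel: admission by gender, stratified by department
cmh <- mantelhaen.test(UCBAdmissions)
print(cmh)
```

Each result is an htest object, so the p-value is available as chi$p.value, fis$p.value and cmh$p.value.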

- correlation

category type

(1)Phi/Contingency/Cramer's V

> library(vcd)
> assocstats(mytable)

(2)Pearson/Spearman/Kendall

> cor(x,use= ,method= )        #default:use="everything",method="pearson"
> cov(data)        #covariance
> cor.test(x,y,alternative= ,method= )        #test one relationship at a time
> corr.test(x,use= ,method= )        #of psych;test multiple relationships at a time

use:

  • all.obs:assumes no missing data;missing data produce an error
  • everything:any correlation involving missing data is set to missing(NA)
  • complete.obs:listwise deletion
  • pairwise.complete.obs:pairwise deletion

method:

  • pearson:linear correlation between two quantitative variables
  • spearman:correlation between rank-ordered variables
  • kendall:Kendall's tau,a nonparametric rank correlation
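The effect of the use and method arguments can be sketched with base R only. One missing value is inserted on purpose; the column selection is an assumption for the example.

```r
x <- mtcars[c("mpg", "hp", "wt")]
x_na <- x
x_na$hp[1] <- NA                                      # introduce one missing value

r_all  <- cor(x_na)                                   # use="everything": hp pairs become NA
r_comp <- cor(x_na, use = "complete.obs")             # row 1 dropped for all pairs
r_pair <- cor(x_na, use = "pairwise.complete.obs")    # row 1 dropped only for hp pairs
rho    <- cor(x, method = "spearman")                 # rank-based correlation

# significance test for a single pair of variables
tst <- cor.test(x$mpg, x$wt, alternative = "two.sided", method = "pearson")

print(r_all)
print(r_pair)
print(tst)
```

With complete.obs every coefficient is based on the same rows, while pairwise.complete.obs keeps as many rows as possible for each pair, so the two can differ.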
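(3)partial correlation

> library(ggm)
> pcor(u,S)        #u:numeric vector of column indices(first two:variables to correlate;the rest:conditioning variables);S:covariance matrix
> pcor.test(r,q,n)        #r:partial correlation coefficient;q:number of conditioning variables;n:sample size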
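If ggm is not installed, the same quantity can be approximated by hand. The sketch below is not ggm's implementation: partial_cor is a hypothetical helper that conditions on every other variable contained in S, using the inverse of the covariance matrix.

```r
# partial correlation of variables i and j given all other variables in S
partial_cor <- function(S, i, j) {
  P <- solve(S)                          # precision (inverse covariance) matrix
  -P[i, j] / sqrt(P[i, i] * P[j, j])
}

S <- cov(mtcars[c("mpg", "hp", "wt")])
r_mpg_hp <- partial_cor(S, 1, 2)         # mpg vs hp, controlling for wt

# cross-check: correlate the residuals left after regressing each on wt
res_mpg <- resid(lm(mpg ~ wt, data = mtcars))
res_hp  <- resid(lm(hp ~ wt, data = mtcars))
r_check <- cor(res_mpg, res_hp)

print(c(precision = r_mpg_hp, residuals = r_check))
```

Both routes give the same number, which is a quick way to convince yourself what "controlling for wt" means.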

continuous type

(1)parameter

1)independent sample

> t.test(y~x,data=dataframe)        #or t.test(y1,y2)

2)dependent sample

> t.test(y1,y2,paired=TRUE)
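Both forms of the t test can be tried on the built-in sleep data. Treating the two groups first as independent and then as paired is done here only to contrast the calls; the data are in fact measurements on the same ten subjects.

```r
# independent samples, formula interface (Welch test by default)
ind <- t.test(extra ~ group, data = sleep)

# the same test with two numeric vectors
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]
same <- t.test(g1, g2)

# dependent samples: observations are matched by position
dep <- t.test(g1, g2, paired = TRUE)

print(ind)
print(dep)
```

The paired test works on the ten within-subject differences, so its degrees of freedom are n - 1 = 9.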

3)more than two groups:ANOVA

  • one-way ANOVA (y~A)
> aov(formula,data=dataframe)
> TukeyHSD(fit)        #pairwise comparison;fit is the object returned by aov()
  • one-way ANCOVA (y~x + A)
  • two-way factorial ANOVA (y~A * B)
  • repeated measures ANOVA (y~ B*W + Error(Subject/W))
  • multivariate ANOVA(MANOVA)
> fit<-manova(y~A)
> summary.aov(fit)        #univariate ANOVA for each response
> Wilks.test(y,shelf,method="mcd")        #of rrcov;robust one-way MANOVA,shelf is the grouping factor
  • regression
> fit.lm<-lm(y~A,data)
> summary(fit.lm)
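A one-way ANOVA with pairwise comparisons, followed by the equivalent regression, might look like this on the built-in PlantGrowth data (weight by group). The variable names below are illustrative.

```r
# one-way ANOVA: y ~ A
fit <- aov(weight ~ group, data = PlantGrowth)
print(summary(fit))

# pairwise comparisons between the three groups
tuk <- TukeyHSD(fit)
print(tuk)

# the same model fitted as a regression
fit_lm <- lm(weight ~ group, data = PlantGrowth)
print(summary(fit_lm))

# the overall F test is identical in both fits
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]
f <- summary(fit_lm)$fstatistic
lm_p <- unname(pf(f[1], f[2], f[3], lower.tail = FALSE))
print(c(aov = anova_p, lm = lm_p))
```

aov() and lm() fit the same linear model; they differ only in how the result is summarised.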

(2)nonparameter

  • two groups
> wilcox.test(y~x,data)        #or wilcox.test(y1,y2)
  • more than two groups
> kruskal.test(y~A,data)        #groups independent
> friedman.test(y~A|B,data)        #groups dependent;B is the blocking variable
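The nonparametric calls can be sketched with built-in data as well. Using PlantGrowth for the independent groups and averaged warpbreaks for the blocked design is an assumption for the example; exact = FALSE only avoids the warning about ties.

```r
# two groups: keep only the two treatment groups of PlantGrowth
two <- subset(PlantGrowth, group != "ctrl")
two$group <- droplevels(two$group)
wil <- wilcox.test(weight ~ group, data = two, exact = FALSE)

# more than two independent groups
kru <- kruskal.test(weight ~ group, data = PlantGrowth)

# dependent groups: one value per wool type within each tension block
wb <- aggregate(breaks ~ wool + tension, data = warpbreaks, FUN = mean)
fri <- friedman.test(breaks ~ wool | tension, data = wb)

print(wil)
print(kru)
print(fri)
```

friedman.test() expects exactly one observation per group and block, which is why warpbreaks is averaged first.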

2.resampling(permutation tests)

permutation tests of coin

Function                    Description
oneway_test(y~A)            two-sample and K-sample permutation test
oneway_test(y~A | C)        two-sample and K-sample permutation test with a stratification(blocking) factor
wilcox_test(y~A)            Wilcoxon-Mann-Whitney rank-sum test
kruskal_test(y~A)           Kruskal-Wallis test
chisq_test(A~B | C)         Pearson Chi-square test
cmh_test(A~B | C)           Cochran-Mantel-Haenszel test
lbl_test(D~E)               linear-by-linear association test
spearman_test(y~x)          Spearman test
friedman_test(y~A | C)      Friedman test
wilcoxsign_test(y1~y2)      Wilcoxon signed-rank test
  • general form:function_name(formula,data,distribution=)
  • formula=relationship between the variables to be tested
  • data=dataframe
  • distribution="exact"/"asymptotic"/"approximate"
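To show the idea behind an approximate (Monte Carlo) permutation distribution, here is a hand-written two-sample permutation test in base R. It is a conceptual sketch, not coin's implementation; perm_test_mean_diff and its arguments are hypothetical names.

```r
set.seed(1234)   # resampling is random; fix the seed for reproducibility

perm_test_mean_diff <- function(y, g, n_perm = 2000) {
  g <- factor(g)
  stopifnot(nlevels(g) == 2)
  # test statistic: difference between the two group means
  stat <- function(lbl) {
    mean(y[lbl == levels(g)[1]]) - mean(y[lbl == levels(g)[2]])
  }
  observed <- stat(g)
  # shuffle the group labels and recompute the statistic each time
  permuted <- replicate(n_perm, stat(sample(g)))
  # two-sided p-value; the +1 counts the observed arrangement itself
  p <- (sum(abs(permuted) >= abs(observed)) + 1) / (n_perm + 1)
  list(statistic = observed, p.value = p)
}

res <- perm_test_mean_diff(sleep$extra, sleep$group)
print(res)
```

With distribution="exact" all possible label arrangements would be enumerated instead of a random sample of them, which is only feasible for small samples.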
permutation tests of lmPerm

Function                            Description
lmp(A~B,data=,perm=)                simple regression
lmp(A~B+I(B^2),data=,perm=)         polynomial regression
lmp(A~B+C+D+E,data=,perm=)          multiple regression
aovp(A~B,data=,perm=)               one-way ANOVA
aovp(A~B+C,data=,perm=)             one-way ANCOVA
aovp(A~B*C,data=,perm=)             two-way factorial ANOVA
  • perm="Exact"/"Prob"/"SPR"

[III]Power analysis

functions of pwr

Function                                                  Description
pwr.2p.test(h=,n=,sig.level=,power=)                      two proportions(equal n)
pwr.2p2n.test(h=,n1=,n2=,sig.level=,power=)               two proportions(unequal n)
pwr.anova.test(k=,n=,f=,sig.level=,power=)                balanced one-way ANOVA
pwr.chisq.test(w=,N=,df=,sig.level=,power=)               Chi-square test
pwr.f2.test(u=,v=,f2=,sig.level=,power=)                  general linear model
pwr.p.test(h=,n=,sig.level=,power=)                       proportion(one sample)
pwr.r.test(n=,r=,sig.level=,power=,alternative=)          correlation coefficient
pwr.t.test(n=,d=,sig.level=,power=,type=,alternative=)    t test(one sample/two samples/paired)
pwr.t2n.test(n1=,n2=,d=,sig.level=,power=,alternative=)   t test(two samples with unequal n)
  • h=ES.h(p1,p2),effect size for proportions
  • n=sample size(per group)
  • $\mu_i$=mean of group i,$\mu$=grand mean
  • $\sigma^2$=error variance
  • sig.level=significance level(default=0.05)
  • power=power level
  • k=number of groups
  • f=$\sqrt{\frac{\sum_{i=1}^{k}{p_i (\mu_i-\mu)^2}}{\sigma^2}}$,$p_i=\frac{n_i}{N}$
  • w=$\sqrt{\sum_{i=1}^{m}{\frac{(p0_i-p1_i)^2}{p0_i}}}$,$p0_i$=cell probability under $H_0$,$p1_i$=cell probability under $H_1$
  • N=total sample size
  • df=degrees of freedom
  • u=numerator degrees of freedom
  • v=denominator degrees of freedom,v=N-k-1 in multiple regression(k=number of predictors)
  • f2=$\frac{R^2}{1-R^2}$($R^2$=squared multiple correlation);
    f2=$\frac{R_{AB}^2-R_A^2}{1-R_{AB}^2}$($R_A^2$=proportion of total variance explained by A,$R_{AB}^2$=proportion of total variance explained by A and B together)
  • r=linear correlation coefficient
  • alternative="two.sided"(default)/"less"/"greater"
  • d=$\frac{\mu_1-\mu_2}{\sigma}$
  • type="two.sample"(default)/"one.sample"/"paired"
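The effect sizes in the list above can be computed by hand in base R. The function names below are hypothetical helpers, the arcsine form of h is the standard definition behind ES.h(), and the grand mean in cohen_f is assumed to be the weighted mean of the group means. The pwr call at the end runs only if the package is installed.

```r
# h for two proportions (arcsine transformation)
es_h <- function(p1, p2) 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# d for two means with common standard deviation sigma
cohen_d <- function(mu1, mu2, sigma) (mu1 - mu2) / sigma

# f for k group means mu with group sizes n and error variance sigma2
cohen_f <- function(mu, n, sigma2) {
  p <- n / sum(n)                      # p_i = n_i / N
  grand <- sum(p * mu)                 # assumed: weighted grand mean
  sqrt(sum(p * (mu - grand)^2) / sigma2)
}

# w for cell probabilities p0 (under H0) and p1 (under H1)
cohen_w <- function(p0, p1) sqrt(sum((p0 - p1)^2 / p0))

# f2 from a squared multiple correlation
f2_from_r2 <- function(r2) r2 / (1 - r2)

d <- cohen_d(10, 8, 4)
print(c(h = es_h(0.6, 0.5), d = d,
        f = cohen_f(c(1, 2, 3), c(10, 10, 10), 1),
        w = cohen_w(c(0.5, 0.5), c(0.6, 0.4)),
        f2 = f2_from_r2(0.5)))

# sample size per group for this d at 80% power, if pwr is available
if (requireNamespace("pwr", quietly = TRUE)) {
  print(pwr::pwr.t.test(d = d, sig.level = 0.05, power = 0.8,
                        type = "two.sample"))
}
```

In every pwr function exactly one of the sample size, effect size, significance level and power arguments is left out, and that is the quantity the function solves for.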

    END!

Reposted from:http://ozfna.baihongyu.com/
