# Request for functions to calculate correlated factors influencing an outcome.


## Request for functions to calculate correlated factors influencing an outcome.

Hi,

I am sorry. I saved the file with the dot after Disp removed, because read.delim was throwing an error about !header, etc. The dot was not the culprit, but I continued to leave it out. Let me paste the full code here.

    x <- read.table("/Users/Documents/StatsTest/fuelEfficiency.txt", header=TRUE, sep="\t")
    x <- data.frame(x)
    for (i in unique(x$Country)) { print(i); y <- subset(x, x$Country == i); print(y); }
    newx <- subset(x, select = c(Price, Reliability, Mileage, Weight, Disp, HP))
    cor(newx, method="pearson")
    my.cor <- cor.test(newx$Weight, newx$Price, method="spearman")
    my.cor <- cor.test(newx$Weight, newx$HP, method="spearman")
    my.cor <- cor.test(newx$Disp, newx$HP, method="spearman")

Putting exact=NULL still doesn't remove the warning:

    my.cor <- cor.test(newx$Disp, newx$HP, method="kendall", exact=NULL)

I tried to find the correlation coefficient for various combinations of variables, but I am unable to interpret the results. (Results are pasted below, in an earlier post.)

I followed it up with a normality test:

    shapiro.test(newx$Disp)
    shapiro.test(newx$HP)

Then I decided to do a kruskal.test(newx), with the result

    Kruskal-Wallis chi-squared = 328.94, df = 5, p-value < 2.2e-16

My question is: I am trying to find factors influencing efficiency (in this case, mileage). What range of functions / examples should I be looking at to find a factor, or combination of factors, influencing efficiency?

Any pointers will be helpful.

Thanks
Lalitha

On Sun, May 3, 2015 at 2:49 PM, Lalitha Viswanathan <[hidden email]> wrote:
> Hi
> I have a dataset of the type attached.
> Here's my code thus far.
> dataset <-data.frame(read.delim("data", sep="\t", header=TRUE));
> newData<-subset(dataset, select = c(Price, Reliability, Mileage, Weight, Disp, HP));
> cor(newData, method="pearson");
> Results are
>                  Price Reliability    Mileage     Weight       Disp         HP
> Price        1.0000000          NA -0.6537541  0.7017999  0.4856769  0.6536433
> Reliability         NA           1         NA         NA         NA         NA
> Mileage     -0.6537541          NA  1.0000000 -0.8478541 -0.6931928 -0.6667146
> Weight       0.7017999          NA -0.8478541  1.0000000  0.8032804  0.7629322
> Disp         0.4856769          NA -0.6931928  0.8032804  1.0000000  0.8181881
> HP           0.6536433          NA -0.6667146  0.7629322  0.8181881  1.0000000
>
> It appears that Wt and Price, Wt and Disp, Wt and HP, Disp and HP, HP and
> Price are strongly correlated.
> To find the statistical significance, I am trying
> sample.correln<-cor.test(newData$Disp, newData$HP, method="kendall", exact=NULL)
>
> Kendall's rank correlation tau
>
> data:  newx$Disp and newx$HP
> z = 7.2192, p-value = 5.229e-13
> alternative hypothesis: true tau is not equal to 0
> sample estimates:
>       tau
> 0.6563871
>
> If I try the same with
> sample.correln<-cor.test(newData$Disp, newData$HP, method="spearman", exact=NULL)
> I get
> Warning message:
> In cor.test.default(newx$Disp, newx$HP, method = "spearman", exact = NULL) :
>   Cannot compute exact p-value with ties
>
> > sample.correln
>
> Spearman's rank correlation rho
>
> data:  newx$Disp and newx$HP
> S = 5716.8, p-value < 2.2e-16
> alternative hypothesis: true rho is not equal to 0
> sample estimates:
>       rho
> 0.8411566
>
> I am not sure how to interpret these values.
> Basically, I am trying to figure out which combination of factors
> influences efficiency.
>
> Thanks
> Lalitha
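For readers following the thread, here is a minimal sketch of functions that bear on the three issues raised above: the NA entries in the correlation matrix, the ties warning from cor.test, and the search for factors influencing mileage. It is not the poster's analysis. It assumes R's built-in mtcars data as a stand-in for the attached file, with mpg, wt, disp and hp playing the roles of Mileage, Weight, Disp and HP, and the missing value is inserted artificially for illustration.

```r
# Stand-in data: the built-in mtcars data set, NOT the poster's file.
# Assumed mapping: mpg ~ Mileage, wt ~ Weight, disp ~ Disp, hp ~ HP.
cars <- mtcars[, c("mpg", "wt", "disp", "hp")]

# Insert one artificial missing value to mimic the NA entries seen
# for Reliability in the correlation matrix.
cars_na <- cars
cars_na$hp[1] <- NA

# With the default use = "everything", any pair involving a column that
# contains NA yields NA. Pairwise deletion uses every complete pair instead.
cor_default  <- cor(cars_na, method = "pearson")
cor_pairwise <- cor(cars_na, method = "pearson", use = "pairwise.complete.obs")

# Rank correlation on data with ties: exact = FALSE asks for the asymptotic
# p-value, so the "Cannot compute exact p-value with ties" warning is not raised.
rho_test <- cor.test(cars$disp, cars$hp, method = "spearman", exact = FALSE)

# Multiple linear regression of mileage on the candidate predictors.
fit <- lm(mpg ~ wt + disp + hp, data = cars)
print(summary(fit))

# Backward selection by AIC: one common, and debated, way to trim predictors.
fit_step <- step(fit, direction = "backward", trace = 0)
print(formula(fit_step))
```

Because Weight, Disp and HP are strongly correlated with each other in the matrix above, individual coefficients from such a regression can be unstable, so the summary is better read as a starting point than as a ranking of factors.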

## Re: Request for functions to calculate correlated factors influencing an outcome.
