On Thu, May 19, 2011 at 7:21 PM, karena <[hidden email]> wrote:

> Hi, I have a question about how to do covariate adjustment.

>

> I have two sets of 'gene expression' data. They are from two different

> tissue types, 'liver' and 'brain', respectively.

> The purpose of my analysis is to compare the pattern of the whole genome

> 'gene expression' between the two tissue types.

> I have 'age' and 'sex' as covariates. Since 'age' and 'sex' definitely have

> influence on gene expression, I need to first filter out the proportion of

> 'gene expression' attributable to 'age' and 'sex', then compare the

> 'remaining gene expression' value between 'liver' and 'brain'.

> How to do the covariate adjustment to remove the effects of these two

> covariates?

> Should I do a 'step-wise' regression or something?

> Which function in R should I use?

>

> I am new to this field, and really appreciate your help!

Go to your local library and get an introductory book on linear

regression (or linear models) that also covers basics of multi-variate

regression.

When you learn a little bit about regression, read at least the first

few chapters of Julian Faraway's book on regression in R,

cran.r-project.org/doc/contrib/Faraway-PRA.pdf

When you're done, use roughly the following command, assuming age is
a numeric variable and expression is a numeric matrix containing the
expression data with samples in rows and genes in columns.

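# Encode the two-level factors sex and tissue as numeric codes (1 or 2)
sex.num = as.numeric(as.factor(sex))
tissue.num = as.numeric(as.factor(tissue))
# A matrix response makes lm fit one model per column (gene)
fit = lm(as.matrix(expression) ~ sex.num + age + tissue.num)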

The result of summary(fit) will be a list with one element per gene.
In each element, look at the component $coefficients[4,4], which is the
p-value for the hypothesis that the coefficient of tissue is zero
(row 4 is tissue.num, column 4 is Pr(>|t|)).
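To make the extraction step concrete, here is a minimal sketch on simulated data. The sample size, the gene names (geneA, geneB, geneC) and the size of the tissue effect are all made up for illustration; only the lm and summary calls correspond to the command above.

```r
set.seed(1)
n = 40                                          # hypothetical number of samples
age = runif(n, 20, 70)                          # simulated ages
sex = rep(c("F", "M"), times = n / 2)           # alternating sexes
tissue = rep(c("brain", "liver"), each = n / 2)

# Simulated expression matrix: samples in rows, 3 hypothetical genes in columns
expression = matrix(rnorm(n * 3), nrow = n,
                    dimnames = list(NULL, c("geneA", "geneB", "geneC")))
# Give geneA a strong tissue effect so there is something to detect
expression[, "geneA"] = expression[, "geneA"] + 3 * (tissue == "liver")

sex.num = as.numeric(as.factor(sex))
tissue.num = as.numeric(as.factor(tissue))
fit = lm(expression ~ sex.num + age + tissue.num)

# summary() of a matrix-response fit returns one summary per gene
fit.summary = summary(fit)

# Pull the tissue p-value out of each per-gene coefficient table
p.tissue = sapply(fit.summary,
                  function(s) s$coefficients["tissue.num", "Pr(>|t|)"])
names(p.tissue) = colnames(expression)
```

Indexing the coefficient table by row and column name instead of [4,4] gives the same number here, but keeps working if the order of the terms in the formula changes.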

Since you likely have thousands of genes, you will have to do a
multiple testing correction, which you could do for example using the
qvalue package, in which the function qvalue takes the vector of
p-values and estimates q-values, a measure of the false discovery rate.
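If the qvalue package is not available, base R's p.adjust with method "BH" (Benjamini-Hochberg) is a related but different way to control the false discovery rate. The sketch below uses it in place of qvalue; the p-values and gene names are invented stand-ins for the per-gene tissue p-values.

```r
# Hypothetical per-gene p-values standing in for the tissue p-values
p.tissue = c(geneA = 0.0001, geneB = 0.004, geneC = 0.04,
             geneD = 0.2, geneE = 0.8)

# Benjamini-Hochberg adjustment (base R; not the qvalue package's estimator)
p.adj = p.adjust(p.tissue, method = "BH")

# Genes called significant at a 5% false discovery rate
significant = names(p.adj)[p.adj < 0.05]
```

The adjusted values are never smaller than the raw p-values, so fewer genes pass the 0.05 cut-off after correction than before it.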

Peter

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.