|
I'm trying to analyze the following data set (sample):
"ID" "adj.P.Val" "logFC" "Gene.symbol" "1419156_at" "5.32e-12" "2.6462565" "Sox4" "1433575_at" "5.32e-12" "3.9417089" "Sox4" "1428942_at" "2.64e-11" "3.9163618" "Mt2" "1454699_at" "2.69e-10" "1.8654677" "LOC100047324///Sesn1" "1416926_at" "3.19e-10" "2.172342" "Trp53inp1" "1422557_s_at" "1.58e-09" "2.9569254" "Mt1" etc. using the following code: muscle = read.table(file="/Users/bob/Desktop/Muscle/musclesmall.txt", header = TRUE, colClasses = "character", fill = TRUE) upregulated_list = c() downregulated_list = c() nochange = c() p_thresh = 6.51e-06 x=1 while (x <= nrow(muscle)) { this_pval = muscle[x,"adj.P.Val"] this_M = muscle[x, "logFC"] if (muscle[x, "Gene.symbol"] == "") { x= x +1 } else {if ((this_M >= 1.0) & (this_pval <= p_thresh)) { upregulated_list <- append(upregulated_list, muscle[x,"Gene.symbol"],after=length(upregulated_list)) x = x +1} else {if ((this_M <= -1) & (this_pval <= p_thresh)) { downregulated_list <- append(downregulated_list, muscle[x,"Gene.symbol"],after=length(downregulated_list)) x = x+1 } else {if ((this_M > -1) & (this_M < 1)) { nochange <- append(nochange, muscle[x,"Gene.symbol"],after=length(nochange)) x = x+1} } } } } This process, however, goes line-by-line and the data has 22,000 rows, so running the process takes an enormous amount of time. Is there any way for me to do the analysis faster? |
|
Hello,
The trick is to use index vectors. They allow us to do without loops. Try the following. muscle <- read.table(text=' "ID" "adj.P.Val" "logFC" "Gene.symbol" "1419156_at" "5.32e-12" "2.6462565" "Sox4" "1433575_at" "5.32e-12" "3.9417089" "Sox4" "1428942_at" "2.64e-11" "3.9163618" "Mt2" "1454699_at" "2.69e-10" "1.8654677" "LOC100047324///Sesn1" "1416926_at" "3.19e-10" "2.172342" "Trp53inp1" "1422557_s_at" "1.58e-09" "2.9569254" "Mt1" ', header=TRUE, stringsAsFactors=FALSE) muscle p_thresh = 6.51e-06 # Create index vectors gsym <- muscle$Gene.symbol != "" this_pval <- muscle$adj.P.Val <= p_thresh this_Ma <- muscle$logFC > -1 this_Mb <- muscle$logFC < 1 # Use them downregulated_list <- muscle$Gene.symbol[gsym & !this_Ma & this_pval] upregulated_list <- muscle$Gene.symbol[gsym & !this_Mb & this_pval] nochange <- muscle$Gene.symbol[gsym & this_Ma & this_Mb] # See the result [ Maybe with head() ] upregulated_list downregulated_list nochange Hope this helps, Rui Barradas Em 12-06-2012 21:55, mousy0815 escreveu: > upregulated_list = c() > downregulated_list = c() > nochange = c() > p_thresh = 6.51e-06 > x=1 > > while (x <= nrow(muscle)) { > this_pval = muscle[x,"adj.P.Val"] > this_M = muscle[x, "logFC"] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
Another alternative is to use
?subset HTH, Jorge.- On Tue, Jun 12, 2012 at 6:06 PM, Rui Barradas <> wrote: > Hello, > > The trick is to use index vectors. They allow us to do without loops. > > Try the following. > > > muscle <- read.table(text=' > > "ID" "adj.P.Val" "logFC" "Gene.symbol" > "1419156_at" "5.32e-12" "2.6462565" "Sox4" > "1433575_at" "5.32e-12" "3.9417089" "Sox4" > "1428942_at" "2.64e-11" "3.9163618" "Mt2" > "1454699_at" "2.69e-10" "1.8654677" "LOC100047324///Sesn1" > "1416926_at" "3.19e-10" "2.172342" "Trp53inp1" > "1422557_s_at" "1.58e-09" "2.9569254" "Mt1" > ', header=TRUE, stringsAsFactors=FALSE) > > muscle > > p_thresh = 6.51e-06 > > # Create index vectors > gsym <- muscle$Gene.symbol != "" > this_pval <- muscle$adj.P.Val <= p_thresh > this_Ma <- muscle$logFC > -1 > this_Mb <- muscle$logFC < 1 > > # Use them > downregulated_list <- muscle$Gene.symbol[gsym & !this_Ma & this_pval] > upregulated_list <- muscle$Gene.symbol[gsym & !this_Mb & this_pval] > nochange <- muscle$Gene.symbol[gsym & this_Ma & this_Mb] > > # See the result [ Maybe with head() ] > upregulated_list > downregulated_list > nochange > > > Hope this helps, > > Rui Barradas > Em 12-06-2012 21:55, mousy0815 escreveu: > > upregulated_list = c() >> downregulated_list = c() >> nochange = c() >> p_thresh = 6.51e-06 >> x=1 >> >> while (x <= nrow(muscle)) { >> this_pval = muscle[x,"adj.P.Val"] >> this_M = muscle[x, "logFC"] >> > > ______________________________**________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/**listinfo/r-help<https://stat.ethz.ch/mailman/listinfo/r-help> > PLEASE do read the posting guide http://www.R-project.org/** > posting-guide.html <http://www.R-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
| Powered by Nabble | Edit this page |
