# Re: find multiple mode, sorry for not providing enough information

11 messages

## Re: find multiple mode, sorry for not providing enough information

In reply to this post by Yuan Chun Ding

Hi Ding,

While I was completely off the track in my first reply, the subsequent posts make your problem somewhat clearer. The way you state the problem suggests that the order of the values of "freq" is important. That is, it is not just a matter of finding local maxima; the direction in which you approach those maxima also matters. For example, I might want to identify only maxima with at least four monotonically increasing values preceding them and a decrease of at least half the value of the maximum in the succeeding value. Once the problem is broken down into a set of criteria like these, the criteria can be implemented in a function that searches the values in one direction and returns the locations of the maxima that fulfil them.

Jim

On Mon, Mar 16, 2020 at 3:11 PM Yuan Chun Ding <[hidden email]> wrote:

> sorry, I just came back.
>
> Yes, Abby's understanding is right.
>
>     > tem4$Var1
>      [1]  1  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 20 21 22 23 24 25 31
>     > tem4$Freq
>      [1]   1   2   5   5  10   4   4   8   1   1   8   8   2   4   3   1   2   1   1 138 149  14   1   1
>
> I have 2000 markers; this is just one example marker. Var1 is a VNTR marker with alleles 1, 3, 4, etc., a multi-allele marker; the corresponding frequency for each allele is 1, 2, 5, etc. I want to convert this multi-allele marker to bi-allele markers by choosing a cutoff value; I would want the cut point to be allele 6 with frequency of 10, so allele 1 to allele 9 are considered as "short" allele, allele 10 to 31 as "long" allele; then sliding to the next rising frequency peak, allele 8 with frequency of 8, etc.
>
> Maybe those rising peaks are not really multiple modes, but I want to do this type of data conversion. I want to first determine the number of modes, then convert the input data file into m different input files, then perform Cox regression analysis for each converted file. I am stuck at the step of finding the m rising peaks.
>
> Thank you,
>
> Ding

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
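
As an illustration of the criteria-based search Jim describes, here is a minimal sketch in base R. The function name `find.peaks` and its parameters `n.rise` and `drop.frac` are hypothetical (they do not come from the thread), and "monotonically increasing" is assumed to mean strictly increasing. It scans left to right and keeps positions where the preceding `n.rise` values rise strictly up to the candidate and the next value falls by at least `drop.frac` of the candidate's value.

```r
# Hypothetical helper, not from the thread: returns the positions i such that
#   (a) x[i - n.rise], ..., x[i] is strictly increasing, and
#   (b) x[i] - x[i + 1] >= drop.frac * x[i].
find.peaks <- function (x, n.rise = 4, drop.frac = 0.5) {
  n <- length (x)
  # Need n.rise values before the candidate and one value after it.
  if (n < n.rise + 2) return (integer (0))
  candidates <- (n.rise + 1):(n - 1)
  keep <- vapply (candidates, function (i) {
    rising   <- all (diff (x [(i - n.rise):i]) > 0)
    dropping <- (x [i] - x [i + 1]) >= drop.frac * x [i]
    rising && dropping
  }, logical (1))
  candidates [keep]
}

# Toy data (made up for illustration)
toy <- c (1, 2, 3, 4, 9, 2, 5, 6, 3)
strict.peaks <- find.peaks (toy)              # four rising values required
loose.peaks  <- find.peaks (toy, n.rise = 2)  # only two rising values required

# Ding's frequencies with a relaxed rule: one rising step, then a 50% drop
freq <- c (1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)
freq.peaks <- find.peaks (freq, n.rise = 1, drop.frac = 0.5)
```

Changing `n.rise` and `drop.frac` changes which maxima count as "rising peaks", so the criteria would need to be chosen to match what a peak should mean for the markers.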

## Re: find multiple mode, sorry for not providing enough information

In reply to this post by Yuan Chun Ding

Sorry, internet's not working properly today. Third time lucky....

Here's a solution to your original question:

---------

    freq <- c (1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)

    # drop consecutive repeats, keeping the first value of each run
    # (the leading TRUE keeps the index the same length as x)
    unique.consecutive <- function (x)
    {   dx <- diff (x)
        x [c (TRUE, dx != 0)]
    }

    # indices of local maxima; x must have no equal neighbours
    which.maxs <- function (x, ..., include.endpoints=FALSE)
    {   dx <- diff (x)
        if (any (dx == 0) )
            stop ("function needs unique-consecutive values")
        ndx <- length (dx)
        # an interior point is a maximum if the slope goes from + to -
        I <- c (FALSE, dx [-ndx] > 0 & dx [-1] < 0, FALSE)
        if (include.endpoints)
        {   I [1] <- (dx [1] < 0)
            I [ndx + 1] <- (dx [ndx] > 0)
        }
        which (I)
    }

    freq.sub <- unique.consecutive (freq)
    maxv <- freq.sub [which.maxs (freq.sub, include.endpoints=TRUE)]
    maxv
    unique (maxv)

---------

Some comments:

My package, probhat, contains early prototype-quality functions for discrete kernel smoothing. These can be used to "smooth" frequency data, which in turn can eliminate spurious modes.

https://cran.r-project.org/web/packages/probhat/vignettes/probhat.pdf

Unfortunately, bandwidth selection is manual. Also note that it currently returns only probability mass (not frequency), but it's very easy to get frequency from probability mass.

I'm planning to resume work on this package in two to three days, so I'm open to suggestions...
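
The solution above returns the peak frequencies, whereas Ding's question also needs the alleles at which the peaks occur. The following is a hedged sketch of one way to recover them with base R's `rle()`. The names `peak.positions` and `dichotomize` are hypothetical, only interior peaks are reported (endpoints are not considered), and the rule that alleles strictly below the cut are "short" is an assumption, since the thread does not pin down where the boundary falls.

```r
var1 <- c (1,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,31)
freq <- c (1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)

# Hypothetical helper: positions (in the original vector) of interior local
# maxima. A run of equal values counts as one point, reported at its first index.
peak.positions <- function (x) {
  r <- rle (x)
  v <- r$values
  n <- length (v)
  if (n < 3) return (integer (0))
  mid <- 2:(n - 1)
  is.peak <- c (FALSE, v [mid] > v [mid - 1] & v [mid] > v [mid + 1], FALSE)
  run.start <- cumsum (c (1, r$lengths [-n]))  # first index of each run
  run.start [is.peak]
}

pos <- peak.positions (freq)
cut.alleles <- var1 [pos]   # alleles at which the frequency peaks
m <- length (cut.alleles)   # number of bi-allelic recodings

# ASSUMPTION: alleles strictly below the cut are "short", all others "long".
dichotomize <- function (allele, cut) ifelse (allele < cut, "short", "long")

# One short/long labelling of the alleles per candidate cut point
groupings <- lapply (cut.alleles, function (cut) dichotomize (var1, cut))
names (groupings) <- paste0 ("cut_", cut.alleles)
```

Each element of `groupings` could then be used to recode the marker before fitting a separate Cox model, one per cut point.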

## Re: find multiple mode, sorry for not providing enough information

I think I need a different email. Google is making it difficult to send/receive/read completely plain text messages. On my end, it's automatically formatting plain text messages, and doing so incorrectly.