Re: find multiple mode, sorry for not providing enough information

Yuan Chun Ding
Sorry, I just came back.

Yes, Abby's understanding is right.

> tem4$Var1
 [1]  1  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 20 21 22 23 24 25 31
> tem4$Freq
 [1]   1   2   5   5  10   4   4   8   1   1   8   8   2   4   3   1   2   1   1 138 149  14   1   1

I have 2000 markers; this is just one example marker. Var1 is a VNTR marker with alleles 1, 3, 4, etc., i.e. a multi-allele marker, and the corresponding frequencies of the alleles are 1, 2, 5, etc. I want to convert this multi-allele marker into bi-allele markers by choosing a cutoff value. I would want the first cut point to be allele 6 with a frequency of 10, so allele 1 to allele 9 are considered "short" alleles and allele 10 to 31 "long" alleles; then I slide to the next rising frequency peak, allele 9 with a frequency of 8, etc.

Maybe those rising peaks are not really multiple modes, but I want to do this type of data conversion. I want to first determine the number of modes (m), then convert the input dat file into m different input files, then perform a Cox regression analysis for each converted file. I am stuck at the step of finding the m rising peaks.

Thank you,

Ding  

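  library(tidyr)  # separate(), gather()

  # genotypes of marker i (row i of dat), one row per individual
  tem <- as.data.frame(t(dat[i, , drop = FALSE]))
  names(tem) <- "V1"
  tem <- tem[which(tem$V1 != ""), , drop = FALSE]   # drop empty genotypes
  # split each genotype into its two alleles, then stack them in one column
  tem2 <- separate(tem, col = V1, into = c("m1", "m2"), convert = TRUE)
  tem3 <- gather(tem2, marker, VNTR_repeats, m1:m2)
  # allele frequency table (columns Var1 and Freq), sorted by repeat number
  tem4 <- as.data.frame(table(tem3$VNTR_repeats))
  tem4$Var1 <- as.numeric(as.character(tem4$Var1))
  tem4 <- tem4[order(tem4$Var1), ]
  m <-   # number of rising peaks (the step I am stuck on)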
________________________________________
From: Abby Spurdle [[hidden email]]
Sent: Sunday, March 15, 2020 3:42 PM
To: Jim Lemon
Cc: Yuan Chun Ding; r-help mailing list
Subject: Re: [R] find multiple mode

I think people have misinterpreted the question.
The OP wants local maxima from the series.

The original series is frequencies, so your table is frequencies of frequencies.

A solution can be derived by looking at signs of the first and second
differences.
But there may be a simpler way????
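
A minimal sketch of the sign-of-differences idea, assuming that a plateau (a run of equal values such as 8, 8) should count as one peak located at its first position. `local_maxima` is a hypothetical helper name, not a function from any package, and end points are never reported as maxima:

```r
# Allele frequencies of the example marker from this thread.
freq <- c(1, 2, 5, 5, 10, 4, 4, 8, 1, 1, 8, 8, 2, 4, 3, 1, 2, 1, 1,
          138, 149, 14, 1, 1)

# Hypothetical helper: positions of interior local maxima of x.
local_maxima <- function(x) {
  runs   <- rle(x)                                # collapse plateaus into single values
  starts <- cumsum(c(1, head(runs$lengths, -1)))  # first position of each run in x
  slope  <- sign(diff(runs$values))               # +1 = rising, -1 = falling
  peak   <- which(diff(slope) < 0) + 1            # a rise followed directly by a fall
  starts[peak]
}

idx <- local_maxima(freq)
idx        # 5 8 11 14 17 21
freq[idx]  # 10 8 8 4 2 149
```

Without the `rle()` step, ties in the series produce zero differences, and a plain `diff(sign(diff(freq)))` would also flag the shoulder at position 3 and both positions of the 8, 8 plateau.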

On Mon, Mar 16, 2020 at 10:24 AM Jim Lemon <[hidden email]> wrote:

>
> Hi Ding,
> Translating this into R code:
>
> freq<-c(1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)
> > table(freq)
> freq
>  1   2   3   4   5   8  10  14 138 149
>  8   3   1   3   2   3   1   1   1   1
> > library(prettyR)
> > Mode(freq)
> [1] "1"
>
> You have a single modal value (1). If there were at most two ones, you
> would have three values (2,4,8) that could be considered multiple
> modes. What you seem to be doing is considering values that are not
> separated by commas as modes. Perhaps this is a formatting problem
> with your email.
>
> Jim
>
> On Mon, Mar 16, 2020 at 7:55 AM Yuan Chun Ding <[hidden email]> wrote:
> >
> > Hi R users,
> >
> > I want to find multiple modes (10, 8, 149) for the following vector.
> >
> > freq =1,2,5,5  10,4,4,8,1,1,8,8,2,4,3,1,2,1,1 138 149  14,1,1;
> >
> > any suggestion?
> >
> > Thank you,
> >
> > Ding

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: find multiple mode, sorry for not providing enough information

Bert Gunter-2
You **might** do better pursuing this sort of thing on the Bioconductor
site:
https://www.bioconductor.org/help/
They often have professionally written R packages tailored for genomics so
that you don't need to shake and bake your own with all the dangers that
entails (not least of which may be that your methodology is suspect).

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: find multiple mode, sorry for not providing enough information

Yuan Chun Ding


Thank you, Bert!

I just realized that I made a typo in the following email, so I modified it using the red font. We are doing genomics work, but this is an understudied area of genomic research, so there are no professional packages for it. I admit that what I am doing is pretty exploratory.

On Sun, Mar 15, 2020 at 9:11 PM Yuan Chun Ding <[hidden email]> wrote:
I have 2000 markers; this is just one example marker. Var1 is a VNTR marker with alleles 1, 3, 4, etc., i.e. a multi-allele marker, and the corresponding frequencies of the alleles are 1, 2, 5, etc. I want to convert this multi-allele marker into bi-allele markers by choosing a cutoff value. I would want the first cut point to be allele 6 with a frequency of 10, so allele 1 to allele 5 are considered "short" alleles and allele 6 to 31 "long" alleles; then I slide to the next rising frequency peak, allele 9 with a frequency of 8, etc.
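
The conversion described above might be sketched as follows. This is only an illustration under stated assumptions: the `geno` data frame, the list of cut points, the helper name `dichotomize`, and the coding of each individual as a count of "long" alleles (0, 1 or 2) are all made up for the example and are not part of the original analysis.

```r
# Hypothetical genotypes: two repeat counts (m1, m2) per individual.
geno <- data.frame(m1 = c(4, 6, 22, 23, 9),
                   m2 = c(23, 23, 23, 12, 5))

# Hypothetical cut points, e.g. the alleles at the rising frequency peaks.
cut_points <- c(6, 9, 12, 23)

# Alleles below the cut point are "short", alleles at or above it are "long".
# Returns the number of long alleles per individual (an assumed coding).
dichotomize <- function(geno, cut) {
  is_long <- geno >= cut   # logical matrix, TRUE = long allele
  rowSums(is_long)
}

# One bi-allelic coding per cut point, one column each.
biallelic <- sapply(cut_points, function(cut) dichotomize(geno, cut))
colnames(biallelic) <- paste0("cut", cut_points)
biallelic
```

Each column could then be written out as its own input file, or passed directly as a covariate to a separate regression.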


Re: find multiple mode, sorry for not providing enough information

Jeff Newmiller
Anyone following along on the mailing list cannot see your "red text" since this is a plain text mailing list.

Please set your email format to "plain text" when sending messages to the R mailing lists so you don't fool yourself into thinking we can see HTML formatting. It will also avoid corrupting your code examples... which does happen sometimes with formatted emails.

--
Sent from my phone. Please excuse my brevity.


Re: find multiple mode, sorry for not providing enough information

Jim Lemon-4
In reply to this post by Yuan Chun Ding
Hi Ding,
While I was completely off the track in my first reply, the subsequent
posts make your problem somewhat clearer. The way you state the
problem suggests that the order of the values of "freq" is important.
That is, it is not just a matter of finding local maxima, but the
direction in which you approach those maxima is important. For
example, I might want to only identify maxima with at least four
monotonically increasing values preceding them and a decrease of at
least half the value of the maximum in the succeeding value. By
breaking down the problem into a set of criteria, these can be
implemented in a function that will search the values in one
direction, returning the locations of maxima that fulfil those
criteria.

Jim
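
A minimal sketch of such a criteria-based search, scanning from left to right. The function name `find_peaks`, its arguments, and the reading of "monotonically increasing" as "non-decreasing steps" are assumptions made for illustration:

```r
freq <- c(1, 2, 5, 5, 10, 4, 4, 8, 1, 1, 8, 8, 2, 4, 3, 1, 2, 1, 1,
          138, 149, 14, 1, 1)

# Hypothetical helper. A position i is reported when
#   (1) at least min_rise consecutive non-decreasing steps end at i, and
#   (2) the next value is lower by at least min_drop_frac * x[i].
find_peaks <- function(x, min_rise = 4, min_drop_frac = 0.5) {
  n <- length(x)
  if (n < 2) return(integer(0))
  rise <- integer(n)                 # rise[i] = length of the rising run ending at i
  for (i in seq_len(n)[-1]) {
    rise[i] <- if (x[i] >= x[i - 1]) rise[i - 1] + 1L else 0L
  }
  i    <- seq_len(n - 1)             # the last value has no successor
  drop <- x[i] - x[i + 1]
  keep <- rise[i] >= min_rise & drop > 0 & drop >= min_drop_frac * x[i]
  i[keep]
}

find_peaks(freq)                                    # 5
find_peaks(freq, min_rise = 2, min_drop_frac = 0.5) # 5 8 12 21
```

With the stricter defaults only the peak of 10 at position 5 qualifies. Relaxing the rise requirement to two steps returns positions 5, 8, 12 and 21; note that a plateau is reported at its last position here, because that is where the drop occurs.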

On Mon, Mar 16, 2020 at 3:11 PM Yuan Chun Ding <[hidden email]> wrote:

>
> sorry, I just came back.
>
> Yes,  Abby's understanding is right.
>
> > tem4$Var1
>  [1]  1    3   4   5   6    7   8   9  10  11  12  13  14  15  16  17  18  20   21   22    23     24   25   31
> > tem4$Freq
>  [1]   1   2   5   5  10   4   4   8   1    1    8    8    2    4    3    1    2    1     1   138  149    14    1     1
>
> I have 2000 markers, this is just one example marker, the var1 is a VNTR marker with alleles 1, 3, 4 etc, a multi-allele marker; the corresponding frequency for each allele is 1,2 5 etc.  I want to convert this multi-allele marker to bi-allele markers by choosing a cutoff value; I would want the cut point to be allele 6 with frequency of 10, so allele 1 to allele 9 are considered as "short" allele, allele 10 to 31 as "long" allele;  then sliding to next rsing frequency peak, allele 8 with frequency of 8, etc.
>
> maybe those rising peaks are not really multiple modes, but I want to do this type of data conversion.  I want to first determine the number of modes, then convert input dat file into m different input files, then perform Cox regression analysis for each converted file. I am stuck in the step of find out m rise peaks.
>
> Thank you,
>
> Ding
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: find multiple mode, sorry for not providing enough information

Yuan Chun Ding
Hi Jim,

Yes, you are right.  I sorted tem4$Var1 first, then looked for rising peaks in the Freq variable from left to right.  I probably need to define a minimal rise and drop on both sides of a potential maximum, to avoid identifying really small peaks.  For example, I only want to identify freq=10 (var1 = allele 6), freq=8 (var1 = allele 9), freq=8 (var1 = allele 12), and freq=149 (var1 = allele 23), but ignore freq=4 (var1 = allele 15) and freq=2 (var1 = allele 18).
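
A rough sketch of the "minimal rise and drop on both sides" idea, assuming the rise and drop are measured against the nearest trough on each side. The function name peaks_min_change and the threshold min_change are hypothetical.

```r
var1 <- c(1,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,31)
freq <- c(1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)

# Hypothetical helper: keep a local maximum only if it stands at least
# min_change above the nearest trough on BOTH sides.
peaks_min_change <- function(value, freq, min_change) {
  keep <- c(TRUE, diff(freq) != 0)   # first element of each run of ties
  v <- value[keep]
  f <- freq[keep]
  n <- length(f)
  hits <- integer(0)
  if (n >= 3) {
    for (i in 2:(n - 1)) {
      if (f[i] > f[i - 1] && f[i] > f[i + 1]) {
        l <- i
        while (l > 1 && f[l - 1] < f[l]) l <- l - 1   # walk down to left trough
        r <- i
        while (r < n && f[r + 1] < f[r]) r <- r + 1   # walk down to right trough
        if (f[i] - f[l] >= min_change && f[i] - f[r] >= min_change) {
          hits <- c(hits, i)
        }
      }
    }
  }
  data.frame(var1 = v[hits], freq = f[hits])
}

peaks_min_change(var1, freq, min_change = 3)
```

With min_change = 3 this keeps alleles 6, 9, 12 and 23 and drops the small peaks at alleles 15 and 18. The threshold would probably need tuning per marker.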

I am still working on it, any help would be really appreciated.

Thank you,

Ding

-----Original Message-----
From: Jim Lemon [mailto:[hidden email]]
Sent: Monday, March 16, 2020 1:10 AM
To: Yuan Chun Ding; r-help mailing list
Subject: Re: [R] find multiple mode, sorry for not providing enough information

----------------------------------------------------------------------
Hi Ding,
While I was completely off the track in my first reply, the subsequent posts make your problem somewhat clearer. The way you state the problem suggests that the order of the values of "freq" is important.
That is, it is not just a matter of finding local maxima, but the direction in which you approach those maxima is important. For example, I might want to only identify maxima with at least four monotonically increasing values preceding them and a decrease of at least half the value of the maximum in the succeeding value. By breaking down the problem into a set of criteria, these can be implemented in a function that will search the values in one direction, returning the locations of maxima that fulfil those criteria.

Jim

On Mon, Mar 16, 2020 at 3:11 PM Yuan Chun Ding <[hidden email]> wrote:

>
> sorry, I just came back.
>
> Yes,  Abby's understanding is right.
>
> > tem4$Var1
>  [1]  1    3   4   5   6    7   8   9  10  11  12  13  14  15  16  17  18  20   21   22    23     24   25   31
> > tem4$Freq
>  [1]   1   2   5   5  10   4   4   8   1    1    8    8     2     4    3    1    2    1     1   138  149    14    1     1
>
> I have 2000 markers, this is just one example marker, the var1 is a VNTR marker with alleles 1, 3, 4 etc, a multi-allele marker; the corresponding frequency for each allele is 1,2 5 etc.  I want to convert this multi-allele marker to bi-allele markers by choosing a cutoff value; I would want the cut point to be allele 6 with frequency of 10, so  patients with allele 1 to allele 5 are considered as carrying "short" allele, allele 6 to 31 as "long" allele;  then sliding to next rsing frequency peak, allele 8 with frequency of 8, etc.
>
> maybe those rising peaks are not really multiple modes, but I want to do this type of data conversion.  I want to first determine m number of modes, then convert input dat file into m different input files, then perform Cox regression analysis for each converted file. I am stuck in the step of find out m rise peaks.
>
> Thank you,
>
> Ding
>

Re: find multiple mode, sorry for not providing enough information

Abby Spurdle
(Sorry, that was supposed to go to the mailing list).

Here's a solution to your original question:
---------
freq <- c (1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)

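# Collapse runs of equal neighbouring values, keeping the first of each run.
unique.consecutive <- function (x)
{       dx <- diff (x)
        x [c (TRUE, dx != 0)]   # leading TRUE makes the index as long as x
}

# Return the positions of the local maxima of x.
# x must not contain equal neighbours (see unique.consecutive).
which.maxs <- function (x, ..., include.endpoints=FALSE)
{       dx <- diff (x)
        if (any (dx == 0) )
                stop ("function needs unique-consecutive values")
        ndx <- length (dx)
        # an interior point is a maximum if x rises into it and falls after it
        I <- c (FALSE, dx [-ndx] > 0 & dx [-1] < 0, FALSE)
        if (include.endpoints)
        {       I [1] <- (dx [1] < 0)
                I [ndx + 1] <- (dx [ndx] > 0)
        }
        which (I)
}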

freq.sub <- unique.consecutive (freq)
maxv <- freq.sub [which.maxs (freq.sub, include.endpoints=TRUE)]

maxv
unique (maxv)
---------

Some comments:

My package, probhat, contains early prototype-quality functions for
discrete kernel smoothing.
This can be used to "smooth" frequency data.
Which in turn, can eliminate spurious modes.
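
To illustrate the smoothing idea without relying on any particular package, here is a small sketch with a hand-rolled 3-point kernel. It is not the probhat API: the helper names smooth3 and local_max and the weights 1/4, 1/2, 1/4 are assumptions, and neighbouring entries of the vector are treated as adjacent alleles even where allele numbers are skipped.

```r
var1 <- c(1,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,31)
freq <- c(1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)

# Hypothetical 3-point kernel smoother; the end values are repeated so the
# result has the same length as the input.
smooth3 <- function(x) {
  n <- length(x)
  left  <- c(x[1], x[-n])
  right <- c(x[-1], x[n])
  (left + 2 * x + right) / 4
}

# Positions where the sequence switches directly from rising to falling.
# Plateaus at a peak are not handled by this simple check.
local_max <- function(x) which(diff(sign(diff(x))) == -2) + 1

smooth_modes <- local_max(smooth3(freq))
var1[smooth_modes]
```

On the posted data the smoothed curve keeps modes at alleles 6, 9, 13 and 23, and the small bumps at alleles 15 and 18 disappear. A wider kernel would merge more of the peaks, which is the manual bandwidth choice in miniature.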

https://cran.r-project.org/web/packages/probhat/vignettes/probhat.pdf

Unfortunately, bandwidth selection is manual.
Also note that currently it only returns probability mass (not
frequency) but it's very easy to get frequency from probability
mass.
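
The conversion is just a rescaling by the total count. A minimal sketch, assuming p is a plain numeric vector of probability masses over the same alleles (the variable names are illustrative only):

```r
freq <- c(1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)

n <- sum(freq)       # total number of observed alleles
p <- freq / n        # probability mass, sums to 1
freq_back <- p * n   # multiply by the total count to get frequency again
```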

I'm planning to resume work on this package in two to three days, so
I'm open to suggestions...

Re: find multiple mode, sorry for not providing enough information

Abby Spurdle
I think I need a different email.
Google is making it difficult to send/receive/read completely plain
text messages.
On my end, it's automatically formatting plain text messages, and
doing so, incorrectly.

Re: find multiple mode, sorry for not providing enough information

Yuan Chun Ding
In reply to this post by Abby Spurdle
Hi Abby,

Thank you so much for your effort!! I really appreciate your help!!

I modified your code a little to get both maxv and corresponding alleles in var1.

For now, I do not care too much for identifying small peaks.

Ding

var1 <- c(1,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,31)
freq <- c(1,2,5,5,10,4,4,8,1,1,8,8,2,4,3,1,2,1,1,138,149,14,1,1)
var_freq <- data.frame(var1, freq)

# keep the first row of each run of equal neighbouring frequencies;
# the leading TRUE makes the index as long as the number of rows
temp8 <- var_freq[c(TRUE, diff(var_freq$freq) != 0), ]

which.maxs <- function (x, ..., include.endpoints=FALSE)
{       dx <- diff (x)
        if (any (dx == 0) )
                stop ("function needs unique-consecutive values")
        ndx <- length (dx)
        I <- c (FALSE, dx [-ndx] > 0 & dx [-1] < 0, FALSE)
        if (include.endpoints)
        {       I [1] <- (dx [1] < 0)
                I [ndx + 1] <- (dx [ndx] > 0)
        }
        which (I)
}

# rows (allele and frequency) at the local maxima
var_freq_modes <- temp8 [which.maxs (temp8$freq, include.endpoints=TRUE),]
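
Once the peak alleles are known, the short/long recoding described earlier in the thread could be sketched as below. The cut points are the peaks Ding listed as wanted, the example alleles are made up for illustration, and alleles below a cut point are coded "short" while the cut point itself and everything above it are coded "long".

```r
cutpoints <- c(6, 9, 12, 23)       # peak alleles used as cut points
alleles   <- c(4, 6, 10, 23, 31)   # hypothetical observed alleles

# one column per cut point, one row per observed allele
coded <- sapply(cutpoints, function(cp) ifelse(alleles < cp, "short", "long"))
colnames(coded) <- paste0("cut_", cutpoints)
rownames(coded) <- paste0("allele_", alleles)
coded
```

Each column then corresponds to one bi-allelic version of the marker, which could be written out as a separate input file.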

-----Original Message-----
From: Abby Spurdle [mailto:[hidden email]]
Sent: Monday, March 16, 2020 1:57 PM
To: Yuan Chun Ding
Cc: Jim Lemon; r-help mailing list
Subject: Re: [R] find multiple mode, sorry for not providing enough information

Re: find multiple mode, sorry for not providing enough information

Abby Spurdle
#bug fix: the index must be as long as x, so prepend TRUE to keep x [1]
unique.consecutive <- function (x)
{       dx <- diff (x)
        x [c (TRUE, dx != 0)]
}
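
A small illustration of why the leading TRUE matters: diff (x) is one element shorter than x, so a logical index built from it alone gets recycled by R. The two helper names below are hypothetical and exist only for this comparison.

```r
# index built from diff (x) alone: one element too short, so it is recycled
drop.ties.short <- function (x) x [diff (x) != 0]

# index padded with a leading TRUE: same length as x, keeps first of each run
drop.ties.fixed <- function (x) x [c (TRUE, diff (x) != 0)]

x <- c (1, 1, 2)
drop.ties.short (x)   # index FALSE,TRUE is recycled to FALSE,TRUE,FALSE
drop.ties.fixed (x)
```

For this x the short index returns only 1 and loses the 2, while the padded index returns 1 and 2.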
