Beginner needs help with R


Beginner needs help with R

Nabila Arbi
Dear R-Help Team!

I have some trouble with R. It's probably nothing big, but I can't find a
solution.
My problem is the following:
I am trying to download some sequences from NCBI using the ape package.

seq1 <- paste("DQ", seq(060054, 060060), sep = "")

sequences <- read.GenBank(seq1,
                          seq.names = seq1,
                          species.names = TRUE,
                          gene.names = FALSE,
                          as.character = TRUE)

write.dna(sequences, "mysequences.fas", format = "fasta")

My problem is that R doesn't keep the whole sequence number as "060054";
it produces DQ60054 instead (the leading zero, which is essential, is
missing).

Could you please tell me how I can get R to keep the zero at the
beginning of the accession number?

Thank you very much in advance and all the best!

Nabila


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Beginner needs help with R

Adams, Jean
Try this:

seq1 <- paste("DQ0", seq(60054, 60060), sep = "")

Jean


Re: Beginner needs help with R

Ivan Calandra-5
In reply to this post by Nabila Arbi
Hi Nabila,

This is because seq() works on numbers: R reads 060054 as the number
60054, and a number has no leading zero. That's why the 0 is dropped.

One alternative would be:
seq2 <- paste("DQ0", seq(60054, 60060), sep = "")

Would that work for you?

HTH,
Ivan

--
Ivan Calandra, PhD
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
[hidden email]
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra
https://rgzm.academia.edu/IvanCalandra
https://publons.com/author/705639/


Re: Beginner needs help with R

jholtman
In reply to this post by Nabila Arbi
You need the leading zeros, and 'numerics' just give the number without
leading zeros. You can use 'sprintf' to create a character string with
the leading zeros:

> # this is using 'numeric' and drops leading zeros
>
> seq1 <- paste("DQ", seq(060054, 060060), sep = "")
> seq1
[1] "DQ60054" "DQ60055" "DQ60056" "DQ60057" "DQ60058" "DQ60059" "DQ60060"
>
> # use 'sprintf' to create leading zeros
> seq2 <- paste0("DQ", sprintf("%06d", seq(060054, 060060)))
> seq2
[1] "DQ060054" "DQ060055" "DQ060056" "DQ060057" "DQ060058" "DQ060059"
"DQ060060"
>
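
The same idea can be wrapped in a small function. This is only a minimal sketch in base R: the function name make_accessions and its width argument are made up for illustration and are not part of ape or of the thread.

```r
# Hypothetical helper: build accession IDs whose numeric part is
# zero-padded to a fixed width.
make_accessions <- function(prefix, from, to, width = 6) {
  # Build a format such as "%s%06d": a string followed by a
  # zero-padded integer of the requested width.
  fmt <- paste0("%s%0", width, "d")
  sprintf(fmt, prefix, seq(from, to))
}

ids <- make_accessions("DQ", 60054, 60060)
print(ids)
```

The resulting character vector could then be passed to read.GenBank() in place of seq1 from the original post.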


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: Beginner needs help with R

Adrian Dușa
In reply to this post by Nabila Arbi
Two methods, among others:

seq1 <- paste("DQ", sprintf("%0*d", 6, seq(060054, 060060)), sep = "")

or

seq1 <- paste("DQ", formatC(seq(060054, 060060), width = 6, flag = "0"), sep = "")

Hth,
Adrian
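
For comparison, here is a short sketch of how the two approaches line up when formatC() is given an explicit total field width. The variable names are only illustrative; the check at the end simply confirms that both calls produce the same strings for these inputs.

```r
numbers <- seq(60054, 60060)

# formatC: 'width' is the total field width, flag "0" pads with zeros.
ids_formatc <- paste0("DQ", formatC(numbers, width = 6, flag = "0"))

# sprintf: the same padding expressed inside the format string.
ids_sprintf <- sprintf("DQ%06d", numbers)

print(ids_formatc)
print(identical(ids_formatc, ids_sprintf))
```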





--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr. 90-92
050663 Bucharest sector 5
Romania


Re: Beginner needs help with R

Jeff Newmiller
In reply to this post by jholtman
I think it is important to point out that whenever R treats a number as a numeric (integer or double) it loses any base 10 concept of "leading zero" in that internal representation, so in this expression

seq2 <- paste0("DQ", sprintf("%06d", seq(060054, 060060)))

the arguments to seq have leading zeros that are ignored by R and have nothing to do with getting the desired output. That is, the same result can be obtained using

seq2 <- paste0("DQ", sprintf("%06d", seq(60054, 60060)))

or

seq2 <- paste0("DQ", sprintf("%06d", seq(0060054, 00060060)))

since only the zero inside the format string is key to success. (If it makes you more comfortable to put the zeros there for readability that is your choice, but R ignores them.)
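
A quick way to check this at the console, as a minimal sketch (the variable names are made up for illustration): the parser turns the literal 060054 into the same number as 60054, so only the format string controls the padding.

```r
# The parser reads both literals as the same number; the zero is not stored.
with_zero    <- 060054
without_zero <- 60054
same_number  <- identical(with_zero, without_zero)

# Converting the number to text therefore never shows the zero ...
plain <- as.character(with_zero)

# ... unless the format string asks for zero padding to a width of 6.
padded <- sprintf("%06d", without_zero)

print(same_number)
print(plain)
print(padded)
```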

Also note that the paste0 function is not needed when you use sprintf:

seq2 <- sprintf("DQ%06d", seq(60054, 60060))

or

myprefix <- "DQ"
seq2 <- sprintf("%s%06d", myprefix, seq(60054, 60060))
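
Because sprintf is vectorized over all of its arguments, the prefix can vary per element as well. The following is only a sketch with invented prefixes and numbers (the "AB" prefix and the value 123 are not from the original question):

```r
# Hypothetical prefixes and numbers with different digit counts.
prefixes <- c("DQ", "DQ", "AB")
numbers  <- c(60054, 60055, 123)

# Each prefix is paired with the number in the same position,
# and every number is zero-padded to six digits.
ids <- sprintf("%s%06d", prefixes, numbers)
print(ids)
```

Note that "%d" accepts doubles only when they are whole numbers; a value such as 1.5 would raise an error.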

--
Sent from my phone. Please excuse my brevity.


Re: Beginner needs help with R

David Winsemius

> On Feb 6, 2017, at 9:08 AM, Jeff Newmiller <[hidden email]> wrote:
>
> I think it is important to point out that whenever R treats a number as a numeric (integer or double) it loses any base 10 concept of "leading zero" in that internal representation, so in this expression
>
> seq2 <- paste0("DQ", sprintf("%06d", seq(060054, 060060)))

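You can do it just with sprintf (and presumably with formatC as well) if you add a leading "DQ" to the format string. Use %d rather than %s for this: zero-padding a %s conversion works on some platforms and is ignored on others.

> sprintf("DQ%06d", seq(060054, 060060))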
[1] "DQ060054" "DQ060055" "DQ060056" "DQ060057" "DQ060058" "DQ060059" "DQ060060"

--
David.




David Winsemius
Alameda, CA, USA


Re: Beginner needs help with R

Bert Gunter-2
In reply to this post by jholtman
No need for sprintf(). Simply:

> paste0("DQ0",seq.int(60054,60060))

[1] "DQ060054" "DQ060055" "DQ060056" "DQ060057" "DQ060058" "DQ060059"
[7] "DQ060060"


Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: Beginner needs help with R

Ted Harding
Bert, your solution seems to presuppose that the programmer
knows beforehand that the leading digit in the number is "0"
(which in fact is clearly the case in Nabila Arbi's original
query). However, the sequence might arise from some process
outside of the programmer's control, and may then either have
a leading 0 or not. In that case, I think Jim's solution is safer!
Best wishes,
Ted.
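
The difference between the two approaches shows up as soon as the numbers do not all have the same number of digits. A small sketch, assuming a made-up input in which one value already has six digits:

```r
# Hypothetical input: one number has five digits, one already has six.
numbers <- c(60054, 160054)

# Hard-coding the zero in the prefix adds it to every value, needed or not.
fixed_prefix <- paste0("DQ0", numbers)

# Padding to a fixed width only adds a zero where a digit is missing.
padded <- sprintf("DQ%06d", numbers)

print(fixed_prefix)
print(padded)
```

With this input the hard-coded prefix yields one identifier that is a character too long, while the padded version keeps all identifiers at the same length.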



-------------------------------------------------
E-Mail: (Ted Harding) <[hidden email]>
Date: 07-Feb-2017  Time: 16:48:41
This message was sent by XFMail


Re: Beginner needs help with R

Bert Gunter-2
Yes, I was replying to the OP's query **as stated.** I try to avoid
guessing what the OP really *meant*, although I grant that sometimes
this may be necessary.

But do note that the leading 0's in seq() *are* unnecessary:

> sprintf("%02d",1:3)
[1] "01" "02" "03"


Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

