Data read as labels

barb
Hey guys,

I have a strange problem reading a .csv file.
It does not seem to be covered by the usual read.csv techniques.

The data I want to use seems to be stored as the label of each data point,
so I cannot really work with it.


spec<-"EU2001"
part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
part2<-"&csvformat=de&euro=mixed&tr="
tmp<-tempfile()
load<-paste(part1,spec,part2,spec,sep="")
download.file(load,tmp)
file<-read.csv(tmp,sep=";",dec=",", skip="5")
(relevant<-file[,2][1])



Thanks a lot for your help and your time!
Regards

Re: Data read as labels

Krzysztof Mitko


On Mon, May 14, 2012, at 02:33, barb wrote:

> Hey guys,
>
> i have a strange problem reading a .csv file.
> Seems not to be covered by the usual read.csv techniques.
>
> The relevant data i want to use, seems to be saved as the label of the
> data
> point.
> Therefore i can not really use it
>
>
> spec<-"EU2001"
> part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
> part2<-"&csvformat=de&euro=mixed&tr="
> tmp<-tempfile()
> load<-paste(part1,spec,part2,spec,sep="")
> download.file(load,tmp)
> file<-read.csv(tmp,sep=";",dec=",", skip="5")
> (relevant<-file[,2][1])

It seems to me that the problem is the conversion from text to a
known type: the last two lines contain comments instead of data, and the
type of the first column is not recognized. You can suppress all conversions,
remove the problematic lines and then convert manually, or import
only the relevant lines and specify the types. For example:

file <- read.csv(tmp, sep=";", dec=",", skip=5, header=FALSE, nrows=495,
                 colClasses=c("character", "numeric", "NULL", "NULL"))
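
The first option (read everything as text, drop the comment rows, convert by hand) could look roughly like the sketch below. The inline sample is made up to imitate the layout described in this thread; the column names, dates and some of the values are assumptions, not the real Bundesbank file.

```r
# Hypothetical sample imitating the file layout: a header row, data rows
# with a decimal comma, and two trailing comment rows.
sample_lines <- c(
  "Datum;Wert;Flag",
  "1971-01;10716,05;",
  "1971-02;12897,10;",
  "1971-03;11329,50;",
  "Bemerkung: ;;",
  "Methodik: Ab Januar 1993 ...;;"
)

# Read every column as character so nothing is turned into a factor.
raw <- read.csv2(text = sample_lines, colClasses = "character")

# Drop the two trailing comment rows, then convert the value column by hand.
dat <- head(raw, -2)
dat$Wert <- as.numeric(sub(",", ".", dat$Wert, fixed = TRUE))

str(dat$Wert)
```

With the real download you would pass tmp instead of text = sample_lines and keep the skip argument.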

--
Best regards,
Krzysztof Mitko

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Data read as labels

David Winsemius
In reply to this post by barb

On May 14, 2012, at 5:33 AM, barb wrote:

> Hey guys,
>
> i have a strange problem reading a .csv file.
> Seems not to be covered by the usual read.csv techniques.
>
> The relevant data i want to use, seems to be saved as the label of  
> the data
> point.
> Therefore i can not really use it
>
>
> spec<-"EU2001"
> part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
> part2<-"&csvformat=de&euro=mixed&tr="
> tmp<-tempfile()
> load<-paste(part1,spec,part2,spec,sep="")
> download.file(load,tmp)
> file<-read.csv(tmp,sep=";",dec=",", skip="5")
> (relevant<-file[,2][1])
>

If dec="," then you probably want read.csv2().

(Since dec="," is the default in read.csv2(), I would remove that argument
from the call. It seemed to succeed.)

file<-read.csv2(tmp,sep=";", skip="5")
(relevant<-file[,2][1])
[1] 10716,05
496 Levels: 10323,52 10391,38 10716,05 10929,62 11051,23 11329,50  
11380,11 ... Methodik: Ab Januar 1993 einschl. der Zuschätzungen für  
nichtmelde- pflichtigen Außenhandel, die bis Dezember 1992 in den  
Ergänzungen zum Außenhandel enthalten sind.


--
David Winsemius, MD
West Hartford, CT


Re: Data read as labels

barb
Hey David,

thanks for your fast reply. I really appreciate that you answer so many posts.

Unfortunately it's not that easy. Try to operate on the output:

e.g.
file<-read.csv2(tmp,sep=";",skip="5")
a<-(relevant<-file[,2][1])
a*5
# or
as.numeric(relevant<-file[,2][1])

a is saved in the workspace as a factor, and the values I actually need are saved as its labels
(hence my subject line).

Thank You!

Re: Data read as labels

David Winsemius

On May 14, 2012, at 11:23 AM, barb wrote:

> Hey David,
>
> thanks for your fast reply, i really appreciate that you answer so  
> many
> posts.
>
> Unfortunately it´s not that easy. Try to operate with the output:
>
> e.g
> file<-read.csv2(tmp,sep=";",skip="5")
> a<-(relevant<-file[,2][1])
> a*5
> # or
> as.numeric(relevant<-file[,2][1])
>
> a is saved in the workspace as a factor and the values i actually  
> need are
> saved as the labels.
> (therefore my subject)

Your subject line asked about "labels". That is not a word that
represents anything specific in R parlance, except perhaps in plotting
function arguments. If you want to prevent the conversion of
"character" values to factors, then you should be using
stringsAsFactors=FALSE in the read functions.

If you want to convert from factor to numeric correctly, you could
also refer to the FAQ. On my machine the section "7.10 How do I
convert factors to numeric?" is located at:

http://127.0.0.1:13702/doc/manual/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f

You should have a similar copy of the FAQ someplace on your machine.  
It's good to review the "miscellaneous" section a couple of times.
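
For illustration, a small sketch of the factor pitfall discussed here. The factor below is made up, and the decimal-comma step is an assumption based on the values shown earlier in this thread; the FAQ idiom as.numeric(levels(f))[as.integer(f)] on its own would give NA for labels such as "10716,05", because as.numeric() only understands a period as the decimal mark.

```r
# Made-up factor whose labels are numbers written with a decimal comma.
f <- factor(c("10716,05", "12897,10", "10716,05"))

# Pitfall: as.numeric() on a factor returns the internal integer codes.
as.numeric(f)

# Recover the labels first, swap the decimal comma for a period, then convert.
labels_chr <- levels(f)[f]            # same result as as.character(f)
values <- as.numeric(sub(",", ".", labels_chr, fixed = TRUE))
values
```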


David Winsemius, MD
West Hartford, CT


Re: Data read as labels

barb
Hey David,

I tried all this - it doesn't work :(

file<-read.csv2(tmp,sep=";",skip="5") # or file<-read.csv2(tmp,sep=";",skip="5",stringsAsFactors=FALSE)
a<-(relevant<-file[,2])
clean <- as.numeric(levels(a))[as.integer(a)]
clean<-as.numeric(as.character(a))


I often use noquote and strsplit and then convert the data, but I have never dealt with this kind of data
and it drives me crazy =)

Re: Data read as labels

Rui Barradas
In reply to this post by barb
Hello,

Your data.frame has some noise in the last two rows.
See if this works.


#------------------- this is your code ---------------
spec <- "EU2001"
part1 <-
"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
part2 <- "&csvformat=de&euro=mixed&tr="
tmp <- tempfile()
load <- paste(part1, spec, part2, spec, sep="")
download.file(load,tmp)

# read it in, no conversion from strings to factors
file <- read.csv(tmp, sep=";", dec=",", skip=5, stringsAsFactors=FALSE)
# see it
str(file)
head(file)
tail(file)  # -----> problem
# last two rows are messed up
nr <- nrow(file)
# see it without them
tail(file[ -c(nr - 1, nr), ])

# remove the last two rows
fl <- file[ -c(nr - 1, nr), ]
(relevant <- fl[, 2])

Also, 'file' is the name of an R function, so use a different name for your
data frame; reusing it can be confusing.
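
If the value column is still character after the noise rows are gone (likely here, since the comment rows force the whole column to text), one way to finish the job is type.convert() with dec=",". This is only a sketch: fl_demo and its contents are a made-up stand-in for the trimmed data frame, not the real data.

```r
# Made-up stand-in for the trimmed data frame 'fl' from the code above.
fl_demo <- data.frame(
  Datum = c("1971-01", "1971-02", "1971-03"),
  Wert  = c("10716,05", "12897,10", "11329,50"),
  stringsAsFactors = FALSE
)

# type.convert() re-runs the type guessing that read.csv() does,
# here told to treat the comma as the decimal mark.
fl_demo$Wert <- type.convert(fl_demo$Wert, dec = ",", as.is = TRUE)
str(fl_demo)
```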

Hope this helps,

Rui Barradas





Re: Data read as labels

David Winsemius
In reply to this post by barb

On May 15, 2012, at 11:13 AM, barb wrote:

> Hey David,
>
> i tried all this - it doesn´t work :(

Sadder and less informative words were never written!

Learn to express in natural language or in code what you were
expecting, rather than using the phrase "doesn't work", which can mean
any one of an almost infinite number of programming failures.

>
> file<-read.csv2(tmp,sep=";",skip="5") # or file<-read.csv2(tmp,sep=";",skip="5",stringsAsFactors=FALSE)
> a<-(relevant<-file[,2])
> clean <- as.numeric(levels(a))[as.integer(a)]
> clean<-as.numeric(as.character(a))
>

When I do that, I get strings of digits with a comma as the decimal
mark (as.numeric() expects a period, "Punkt" auf Deutsch if I remember
my lessons from 40 years ago, and not a "Komma"), so at the end I get:

....
[490] "94801,00"
[491] "85013,00"
[492] "85982,00"
[493] "91213,00"
[494] "98912,00"
[495] "Bemerkung: "
[496] "Methodik: Ab Januar 1993 einschl. der Zuschätzungen für  
nichtmelde- pflichtigen Außenhandel, die bis Dezember 1992 in den  
Ergänzungen zum Außenhandel enthalten sind."

So what is "not working" and what would be "success"? Is this  
successful?

 > aconv <- sub("\\,", ".", a)
 > str(as.numeric(aconv))
  num [1:496] 10716 12897 11330 10930 11485 ...
Warning message:
In str(as.numeric(aconv)) : NAs introduced by coercion
 > as.numeric(aconv[480:494])
  [1] 78645.63 84067.37 98180.37 84252.25 92003.35 88139.63 85664.93  
85138.00 94960.00
[10] 89170.00 94801.00 85013.00 85982.00 91213.00 98912.00
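
The same idea can be wrapped in a small helper that also tells you which entries failed to convert, so the comment rows do not have to be removed by position. This is just a sketch: parse_decimal_comma is a hypothetical name, and a_demo is a shortened, made-up stand-in for the column shown above.

```r
# Hypothetical helper: turn strings with a decimal comma into numbers.
# Entries that are not numbers become NA (the warning is suppressed on purpose).
parse_decimal_comma <- function(x) {
  suppressWarnings(as.numeric(sub(",", ".", as.character(x), fixed = TRUE)))
}

a_demo <- c("94801,00", "85013,00", "Bemerkung: ", "Methodik: ...")
vals <- parse_decimal_comma(a_demo)

which(is.na(vals))           # positions of the non-numeric entries
clean <- vals[!is.na(vals)]  # keep only the values that converted
clean
```

Because of the as.character() call, the helper accepts a factor as well as a character vector.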




David Winsemius, MD
West Hartford, CT


Re: Data read as labels

barb
That's a success. "Das ist ein Erfolg". :) Maybe you had that in your German lessons, too ;)
So if you are still interested in learning our language, feel free to ask me.
I have learned a lot in this forum, so I am always happy to be able to give something back.