Hey guys,
I have a strange problem reading a .csv file. It does not seem to be covered by the usual read.csv techniques.

The relevant data I want to use seems to be saved as the label of the data point, so I cannot really use it.

spec<-"EU2001"
part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
part2<-"&csvformat=de&euro=mixed&tr="
tmp<-tempfile()
load<-paste(part1,spec,part2,spec,sep="")
download.file(load,tmp)
file<-read.csv(tmp,sep=";",dec=",", skip="5")
(relevant<-file[,2][1])

Thanks a lot for your help and your time!
Regards
On Mon, May 14, 2012, at 02:33, barb wrote:
> Hey guys,
>
> i have a strange problem reading a .csv file.
> Seems not to be covered by the usual read.csv techniques.
>
> The relevant data i want to use, seems to be saved as the label of the
> data point.
> Therefore i can not really use it
>
> spec<-"EU2001"
> part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
> part2<-"&csvformat=de&euro=mixed&tr="
> tmp<-tempfile()
> load<-paste(part1,spec,part2,spec,sep="")
> download.file(load,tmp)
> file<-read.csv(tmp,sep=";",dec=",", skip="5")
> (relevant<-file[,2][1])

It seems to me that the problem is the conversion from data to a known type: the last two lines contain comments instead of data, and the type of the first column is not recognized. You can either suppress all conversions, remove the problematic lines and then convert manually, or import only the relevant lines and specify the types. For example:

file<-read.csv(tmp, sep=";", dec=",",skip=5,header=FALSE,nrows=495,colClasses=c("character","numeric","NULL","NULL"))

--
Best regards,
Krzysztof Mitko

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
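A variant of the "import only relevant lines and specify types" idea that avoids hard-coding nrows=495 is to filter the raw lines before parsing. The following is only a sketch: the four sample lines are made up to resemble the file described in this thread, and the year-month pattern used to recognize data lines is an assumption, not something confirmed by the real download.

```r
# Made-up stand-in for the downloaded file: two data lines followed by
# two comment lines, each with four semicolon-separated fields.
lines <- c("2012-01;94801,00;;",
           "2012-02;85013,00;;",
           ";Bemerkung: ;;",
           ";Methodik: free text;;")

# Assumption: every data line starts with a year-month stamp.
keep <- grepl("^[0-9]{4}-[0-9]{2};", lines)

# Parse only the kept lines and declare the column types up front.
# "NULL" drops a column; read.csv2 already defaults to sep=";" and dec=",".
dat <- read.csv2(text = lines[keep], header = FALSE,
                 colClasses = c("character", "numeric", "NULL", "NULL"))

str(dat)
```

With the real file, the `lines` vector would come from something like `readLines(tmp)[-(1:5)]`, and the pattern would need to be checked against the actual first column.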
In reply to this post by barb
On May 14, 2012, at 5:33 AM, barb wrote:
> Hey guys,
>
> i have a strange problem reading a .csv file.
> Seems not to be covered by the usual read.csv techniques.
>
> The relevant data i want to use, seems to be saved as the label of
> the data point.
> Therefore i can not really use it
>
> spec<-"EU2001"
> part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
> part2<-"&csvformat=de&euro=mixed&tr="
> tmp<-tempfile()
> load<-paste(part1,spec,part2,spec,sep="")
> download.file(load,tmp)
> file<-read.csv(tmp,sep=";",dec=",", skip="5")
> (relevant<-file[,2][1])

If dec="," then you probably need read.csv2(). (Since dec="," is the default there, I would remove that argument from the call.) It seemed to succeed:

file<-read.csv2(tmp,sep=";", skip="5")
(relevant<-file[,2][1])
[1] 10716,05
496 Levels: 10323,52 10391,38 10716,05 10929,62 11051,23 11329,50 11380,11 ... Methodik: Ab Januar 1993 einschl. der Zuschätzungen für nichtmelde- pflichtigen Außenhandel, die bis Dezember 1992 in den Ergänzungen zum Außenhandel enthalten sind.

--
David Winsemius, MD
West Hartford, CT
Hey David,
thanks for your fast reply, I really appreciate that you answer so many posts.

Unfortunately it's not that easy. Try to operate on the output, e.g.:

file<-read.csv2(tmp,sep=";",skip="5")
a<-(relevant<-file[,2][1])
a*5
# or
as.numeric(relevant<-file[,2][1])

a is saved in the workspace as a factor, and the values I actually need are saved as the labels (hence my subject line).

Thank You!
On May 14, 2012, at 11:23 AM, barb wrote:
> Hey David,
>
> thanks for your fast reply, i really appreciate that you answer so
> many posts.
>
> Unfortunately it´s not that easy. Try to operate with the output:
>
> e.g
> file<-read.csv2(tmp,sep=";",skip="5")
> a<-(relevant<-file[,2][1])
> a*5
> # or
> as.numeric(relevant<-file[,2][1])
>
> a is saved in the workspace as a factor and the values i actually
> need are saved as the labels.
> (therefore my subject)

Your subject line asked for "labels". That is not a word that represents anything specific in R parlance, except perhaps plotting function arguments. If you want to prevent the conversion of "character" values to factors, then you should be using stringsAsFactors=FALSE in the read functions. If you want to convert from factor to character correctly, you could also refer to the FAQ. On my machine the section "7.10 How do I convert factors to numeric?" is located at:

http://127.0.0.1:13702/doc/manual/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f

You should have a similar copy of the FAQ someplace on your machine. It's good to review the "miscellaneous" section a couple of times.

> Thank You!

David Winsemius, MD
West Hartford, CT
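To illustrate what the FAQ entry on converting factors to numeric is about, here is a small sketch with a made-up factor. The values and variable names are invented for illustration and use a decimal point; strings written with a decimal comma would still come out as NA from as.numeric() until the comma is replaced.

```r
# A made-up factor whose levels look like numbers.
f <- factor(c("10.5", "3.2", "10.5"))

# as.numeric() on a factor returns the internal level codes,
# not the values printed on screen.
codes <- as.numeric(f)

# The idiom recommended in the R FAQ: convert the levels once,
# then index them by the factor.
vals <- as.numeric(levels(f))[f]

# Equivalent, but converts every element rather than every level.
vals2 <- as.numeric(as.character(f))

print(codes)
print(vals)
```

The first result is the pitfall barb ran into: arithmetic on the codes looks plausible but has nothing to do with the data.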
Hey David,
I tried all this - it doesn't work :(

file<-read.csv2(tmp,sep=";",skip="5") # or
file<-read.csv2(tmp,sep=";",skip="5",stringsAsFactors=FALSE)
a<-(relevant<-file[,2])
clean <- as.numeric(levels(a))[as.integer(a)]
clean<-as.numeric(as.character(a))

I often use noquote and strsplit and then convert the data, but I have never dealt with this kind of data and it drives me crazy =)
In reply to this post by barb
Hello,
Your data.frame has some noise in the last two rows. See if this works.

#------------------- this is your code ---------------
spec <- "EU2001"
part1 <- "http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
part2 <- "&csvformat=de&euro=mixed&tr="
tmp <- tempfile()
load <- paste(part1, spec, part2, spec, sep="")
download.file(load,tmp)

# read it in, no conversion from strings to factors
file <- read.csv(tmp, sep=";", dec=",", skip="5", stringsAsFactors=FALSE)

# see it
str(file)
head(file)
tail(file) # -----> problem

# last two rows are messed up
nr <- nrow(file)
# see it without them
tail(file[ -c(nr - 1, nr), ])

# remove the last two rows
fl <- file[ -c(nr - 1, nr), ]
(relevant <- fl[, 2])

Also, 'file' is the name of an R function; use something else, as it can be confusing.

Hope this helps,

Rui Barradas

On 15-05-2012 11:00, [hidden email] wrote:
> Date: Mon, 14 May 2012 02:33:54 -0700 (PDT)
> From: barb<[hidden email]>
> To:[hidden email]
> Subject: [R] Data read as labels
> Message-ID:<[hidden email]>
> Content-Type: text/plain; charset=us-ascii
>
> Hey guys,
>
> i have a strange problem reading a .csv file.
> Seems not to be covered by the usual read.csv techniques.
>
> The relevant data i want to use, seems to be saved as the label of the data
> point.
> Therefore i can not really use it
>
> spec<-"EU2001"
> part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"
> part2<-"&csvformat=de&euro=mixed&tr="
> tmp<-tempfile()
> load<-paste(part1,spec,part2,spec,sep="")
> download.file(load,tmp)
> file<-read.csv(tmp,sep=";",dec=",", skip="5")
> (relevant<-file[,2][1])
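One detail worth checking after dropping the two noisy rows: the column was read as character because of those rows, and removing them afterwards does not change its type. A possible follow-up step, sketched here on made-up data (the column names and values are invented, not taken from the real download), is to re-run the type conversion with type.convert() and dec=",".

```r
# Made-up stand-in for the file: a header, two data rows and the two
# noisy rows at the end.
txt <- c("month;value",
         "2012-01;94801,00",
         "2012-02;85013,00",
         ";Bemerkung: ",
         ";Methodik: free text")

dat <- read.csv2(text = txt, stringsAsFactors = FALSE)

# Drop the last two rows, as in the reply above.
nr <- nrow(dat)
fl <- dat[-c(nr - 1, nr), ]

# The column is still character at this point.
was_character <- is.character(fl$value)

# Re-run the conversion, telling R that the decimal mark is a comma.
fl$value <- type.convert(fl$value, dec = ",", as.is = TRUE)

str(fl)
```

After this step `fl$value` can be used in arithmetic such as `fl$value * 5`.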
In reply to this post by barb
On May 15, 2012, at 11:13 AM, barb wrote:
> Hey David,
>
> i tried all this - it doesn´t work :(

Sadder and less informative words were never written! Learn to express in natural language or in code what you were expecting, rather than using the phrase "doesn't work", which can mean any one of an almost infinite number of sources of programming failure.

> file<-read.csv2(tmp,sep=";",skip="5") # or
> file<-read.csv2(tmp,sep=";",skip="5",stringsAsFactors=FALSE)
> a<-(relevant<-file[,2])
> clean <- as.numeric(levels(a))[as.integer(a)]
> clean<-as.numeric(as.character(a))

When I do that, I get a series of comma-separated digit values inside strings (because the default setting for the decimal separator is period = Punkt auf Deutsch if I remember my lessons from 40 years ago, "." and not Komma), so at the end I get:

....
[490] "94801,00"
[491] "85013,00"
[492] "85982,00"
[493] "91213,00"
[494] "98912,00"
[495] "Bemerkung: "
[496] "Methodik: Ab Januar 1993 einschl. der Zuschätzungen für nichtmelde- pflichtigen Außenhandel, die bis Dezember 1992 in den Ergänzungen zum Außenhandel enthalten sind."

So what is "not working" and what would be "success"? Is this successful?

> aconv <- sub("\\,", ".", a)
> str(as.numeric(aconv))
 num [1:496] 10716 12897 11330 10930 11485 ...
Warning message:
In str(as.numeric(aconv)) : NAs introduced by coercion
> as.numeric(aconv[480:494])
 [1] 78645.63 84067.37 98180.37 84252.25 92003.35 88139.63 85664.93 85138.00 94960.00
[10] 89170.00 94801.00 85013.00 85982.00 91213.00 98912.00

> i often use noquote and strsplit and then convert data, but i never
> dealed with that kind of data
> and it drives me crazy =)

David Winsemius, MD
West Hartford, CT
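The "NAs introduced by coercion" warning in the transcript above comes from the two comment entries at the end of the vector. A minimal sketch of the same sub() approach that also reports which entries failed to convert might look like this; the short vector stands in for the 496 real values and the variable names are illustrative only.

```r
# Shortened stand-in for the character column: two values taken from the
# output above plus two placeholder comment entries.
a <- c("91213,00", "98912,00", "Bemerkung: ", "Methodik: free text")

# Replace the first comma in each string with a period.
aconv <- sub(",", ".", a, fixed = TRUE)

# Convert; entries that are not numbers become NA. The warning is
# suppressed here because the NAs are inspected explicitly below.
num <- suppressWarnings(as.numeric(aconv))

# Positions that could not be converted, for inspection.
bad <- which(is.na(num))
print(a[bad])

# Keep only the values that converted cleanly.
values <- num[!is.na(num)]
print(values)
```

Looking at `a[bad]` before discarding anything is a cheap way to confirm that only comment lines, and no genuine data, were lost.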
That's a success. "Das ist ein Erfolg". :) Maybe you had that in your German lessons, too ;)
So if you are still interested in learning our language, feel free to ask me. I have really learned a lot in this forum, so I am always happy to be able to give something back.
