readxl question

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

readxl question

Thomas Subia-2
Colleagues,

I'm trying to extract a cell from all Excel files in a directory.

library(readxl)
files <- list.files(pattern="*.xls", full.names = FALSE)

date <- lapply(files, read_excel, sheet="Sheet1", range=("B5"))

date_df <- as.data.frame(date)
trans_date <-t(date_df)
mydates <- list(trans_date)
write.table(mydates,"mydates.txt",sep="\t")

Looking at mydates.txt shows:

""
"Saturday..June.09..2018"
"Saturday..June.09..2018.1"
"Saturday..June.09..2018.2"

But the original Excel contents are:

Saturday, June 09, 2018
Saturday, June 09, 2018
Saturday, June 09, 2018

I get a similar problem with my serial numbers

serial <-lapply(files, read_excel, sheet="Sheet1", range=("B9"))

serial_df <- as.data.frame(serial)
trans_serial <-t(serial_df)
myserials <- list(trans_serial)
write.table(myserials,"myserials.txt",sep="\t")

R Output

""
"X96739.0027.1"
"X96739.0041.1"
"X96739.0044.1"

Original Excel Content
96739-0027/1
96739-0041/1
96739-0044/1

How can I amend my script so that the output matches the original Excel content?

Thomas Subia
Statistician / Senior Quality Engineer
IMG Precision

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: readxl question

Ivan Krylov
On Thu, 5 Dec 2019 15:39:56 +0000
Thomas Subia <[hidden email]> wrote:

> date <- lapply(files, read_excel, sheet="Sheet1", range=("B5"))
> date_df <- as.data.frame(date)
> trans_date <-t(date_df)
> mydates <- list(trans_date)

This feels a bit excessive for what looks like a one-dimensional string
vector. Why is it needed? Can you get better results with sapply or
vapply (which return vectors, not lists)?

In particular, as.data.frame might be responsible for the name
mangling. Also, your data seems to end up inside the row names. Try
using str() on every step of the transformation to check if that is the
case.

Also check out the .name_repair argument of the read_excel function,
but I think that as.data.frame is part of the problem.

--
Best regards,
Ivan

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: [FORGED] readxl question

Rolf Turner
In reply to this post by Thomas Subia-2

The best advice that anyone could give:

See fortunes::fortune("Friends") .

cheers,

Rolf Turner

On 6/12/19 4:39 am, Thomas Subia wrote:

> Colleagues,
>
> I'm trying to extract a cell from all Excel files in a directory.
>
> library(readxl)
> files <- list.files(pattern="*.xls", full.names = FALSE)
>
> date <- lapply(files, read_excel, sheet="Sheet1", range=("B5"))
>
> date_df <- as.data.frame(date)
> trans_date <-t(date_df)
> mydates <- list(trans_date)
> write.table(mydates,"mydates.txt",sep="\t")
>
> Looking at mydates.txt shows:
>
> ""
> "Saturday..June.09..2018"
> "Saturday..June.09..2018.1"
> "Saturday..June.09..2018.2"
>
> But the original Excel contents are:
>
> Saturday, June 09, 2018
> Saturday, June 09, 2018
> Saturday, June 09, 2018
>
> I get a similar problem with my serial numbers
>
> serial <-lapply(files, read_excel, sheet="Sheet1", range=("B9"))
>
> serial_df <- as.data.frame(serial)
> trans_serial <-t(serial_df)
> myserials <- list(trans_serial)
> write.table(myserials,"myserials.txt",sep="\t")
>
> R Output
>
> ""
> "X96739.0027.1"
> "X96739.0041.1"
> "X96739.0044.1"
>
> Original Excel Content
> 96739-0027/1
> 96739-0041/1
> 96739-0044/1
>
> How can I amend my script so that the output matches the original Excel content?

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.