Reorder file names read by list.files function

Ek Esawi
Hi All--

I used the base R list.files function to read files from a directory. The
file names are months (April, August, etc.). The system reads
them in alphabetical order, but I want to reorder them in calendar
order (January, February, ..., December). I thought I might be able to
do it via regex or possibly the gtools package, but I am wondering if there is
an easier way.

Thanks--EK

Example
path = "C:/Users/name/Downloads/MyFiles"
file.names <- dir(path, pattern = "\\.PDF$")

Example output
"February.PDF"  "January.PDF" "March.PDF"
Desired output
"January.PDF"  "February.PDF" "March.PDF"

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Reorder file names read by list.files function

PIKAL Petr
Hi

You could use a brute-force approach: print out file.names and work out the ordering vector by hand.
In the Czech locale it is

oo <- c(6, 11, 1, 4, 5, 2, 3, 10, 12, 9, 7, 8)

In an English locale it is different :-)

After that,
file.names[oo]

should give you the correct order of file names.

Cheers
Petr
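
The ordering vector does not have to be worked out by hand. A minimal sketch, assuming English month names; the vector alphabetical is a stand-in for what the directory listing might return, not output from a real directory:

```r
# Stand-in for an alphabetical listing of the twelve English month names
alphabetical <- c("April", "August", "December", "February", "January", "July",
                  "June", "March", "May", "November", "October", "September")

# Position of each calendar month within the alphabetical listing
oo <- match(month.name, alphabetical)
oo
# [1]  5  4  8  1  9  7  6  2 12 11 10  3

alphabetical[oo]   # calendar order, January first
```

Because the vector is computed from the names themselves, it does not depend on how a particular locale happens to sort them.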

> -----Original Message-----
> From: R-help <[hidden email]> On Behalf Of Ek Esawi
> Sent: Tuesday, October 9, 2018 3:44 PM
> To: [hidden email]
> Subject: [R] Reorder file names read by list.files function

Re: Reorder file names read by list.files function

Rui Barradas
In reply to this post by Ek Esawi
Hello,

You can use the built-in variable month.name to get the calendar order
and match your file names against it.


# month number of each file, then the permutation that sorts by month
i <- order(match(sub("\\.PDF$", "", file.names), month.name))
file.names[i]
#[1] "January.PDF"  "February.PDF" "March.PDF"


Hope this helps,

Rui Barradas
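
A small sketch of how match() and order() work together here, using a made-up listing of four files rather than a real directory: match() returns the month number of each file, and order() turns those numbers into the permutation that sorts the files.

```r
# Hypothetical listing with only some months present, in alphabetical order
file.names <- c("April.PDF", "August.PDF", "December.PDF", "February.PDF")

# Month number of each file: 4 8 12 2
month.num <- match(sub("\\.PDF$", "", file.names), month.name)

# Permutation that puts the month numbers in increasing order
sorted.names <- file.names[order(month.num)]
sorted.names
# [1] "February.PDF" "April.PDF"    "August.PDF"   "December.PDF"
```

Indexing with month.num directly would ask for elements 4, 8, 12 and 2 of a four-element vector and return NA for positions 8 and 12, so the order() step matters whenever some months are missing.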


At 14:44 on 09/10/2018, Ek Esawi wrote:


Re: Reorder file names read by list.files function

Jeff Newmiller
In reply to this post by Ek Esawi
Instead of changing the order in which you read the files, perhaps your analysis will work if you sort the data after you read it in. This may require that you add the month names as a column in the data frames, or you may already have dates in the data that you could sort by.

One idea:

fnames <- paste0( month.name, ".PDF" )
resultdf <- do.call( rbind, lapply( fnames, function(fn) { read.csv( file.path( "datadir", fn ), as.is=TRUE ) } ) )

but that only works if there are exactly 12 files. If there could be fewer, perhaps:

fnames <- list.files( "datadir" )
sfnames <- fnames[ order( match( sub("\\.PDF$", "", fnames ), month.name ) ) ]
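
The sort-after-reading idea can be sketched without touching any files. Everything below is made up for illustration: the list monthly stands in for whatever the reading step returns, and the column names value and month are assumptions.

```r
# Stand-in for data read from files in alphabetical order
monthly <- list(
  February = data.frame(value = c(3, 4)),
  January  = data.frame(value = c(1, 2)),
  March    = data.frame(value = 5)
)

# Tag each data frame with its month, stored as a factor in calendar order
tagged <- Map(function(df, m) {
  df$month <- factor(m, levels = month.name)
  df
}, monthly, names(monthly))

# Combine, then sort the rows by month
resultdf <- do.call(rbind, tagged)
resultdf <- resultdf[order(resultdf$month), ]
rownames(resultdf) <- NULL
```

Storing the month as a factor with levels = month.name is what makes order() follow the calendar rather than the alphabet.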


On October 9, 2018 6:44:21 AM PDT, Ek Esawi <[hidden email]> wrote:


Re: Reorder file names read by list.files function

Ek Esawi
In reply to this post by Ek Esawi
Hi again,

I worked with Rui's idea of using the match function with month.name.
I got numerical values for the months, then I sorted them and pasted on the PDF
file extension. It gave me the file order I wanted, but now statements
8, 9, and 10 don't work, and I keep getting the error listed below.
The dilemma is that if I add full.names=TRUE in statement 6, then statements
9 and 10 don't produce what they did earlier. If I use
full.names=FALSE, then I am back to square one.
Any ideas are greatly appreciated.

The code

1. install.packages("tabulizer")
2. install.packages("stringr")
3. library(stringr)
4. library(tabulizer)
5. path = "C:/Users/namei/Documents/TextMining/S2017"
6. file.names <- dir(path, pattern =".PDF",full.names = TRUE)
7. file.names <- str_remove(file.names,"\\s[0-9][0-9]")
8. FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
9. FNs1 <- paste0(month.name[FNs],".","PDF")
10. A <- lapply(FNs1, function(i) extract_tables(i))

Output and the error message.

path = "C:/Users/eesawi/Documents/TextMining/S2017"
> file.names <- dir(path, pattern =".PDF",full.names = TRUE)
> file.names <- str_remove(file.names,"\\s[0-9][0-9]")
> FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
> FNs1 <- paste0(month.name[FNs],".","PDF")
> A <- lapply(FNs1, function(i) extract_tables(i))

 Error in normalizePath(path.expand(path), winslash, mustWork) :
  path[1]=".PDF": The system cannot find the file specified
On Tue, Oct 9, 2018 at 9:44 AM Ek Esawi <[hidden email]> wrote:


Re: Reorder file names read by list.files function

Rui Barradas
Hello,

I would do something along the lines of

# work in the directory where the files are located
old_dir <- setwd(path)
file.names <- list.files(pattern = "\\.PDF")

[...]

# When you are done reset your wd
setwd(old_dir)


Hope this helps,

Rui Barradas
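
An alternative that avoids changing the working directory is to keep the full paths and compute the order from the base names. A sketch under assumptions: the directory is a throwaway one created with tempfile(), and the three empty files are placeholders for real PDFs.

```r
# Throwaway directory with three empty placeholder files
path <- tempfile("months")
dir.create(path)
invisible(file.create(file.path(path, c("February.PDF", "January.PDF", "March.PDF"))))

# Full paths, as returned by the directory listing
file.names <- list.files(path, pattern = "\\.PDF$", full.names = TRUE)

# Month taken from the base name only, so the directory part cannot interfere
months <- tools::file_path_sans_ext(basename(file.names))
ordered.names <- file.names[order(match(months, month.name))]

basename(ordered.names)
# [1] "January.PDF"  "February.PDF" "March.PDF"

unlink(path, recursive = TRUE)  # clean up the throwaway directory
```

Since ordered.names still holds complete paths, it can be handed to a reader function from any working directory.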

At 21:38 on 09/10/2018, Ek Esawi wrote:


Re: Reorder file names read by list.files function

R help mailing list-2
In reply to this post by Ek Esawi
Use basename(filename) to remove the leading parts of the full path to the
file.  E.g., replace
   FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
with (the untested)
    FNs <- sort(match(sub("\\.PDF", "", basename(file.names)), month.name))

Bill Dunlap
TIBCO Software
wdunlap tibco.com
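
The path[1]=".PDF" in the error message can be traced step by step. A sketch, assuming full paths like the hypothetical ones below: with the directory still attached, nothing matches month.name, sort() drops the resulting NA values, and paste0() on the empty result yields the single string ".PDF".

```r
# Hypothetical full paths, as dir(..., full.names = TRUE) would return them
file.names <- c("C:/tmp/February.PDF", "C:/tmp/January.PDF")

# Without basename(): "C:/tmp/February" is not a month name, so match() gives NA
no.base <- sort(match(sub("\\.PDF$", "", file.names), month.name))
length(no.base)
# [1] 0

# Zero-length input is treated as "", which produces the path in the error
bad.name <- paste0(month.name[no.base], ".", "PDF")
bad.name
# [1] ".PDF"

# With basename(): the month numbers are found
with.base <- sort(match(sub("\\.PDF$", "", basename(file.names)), month.name))
with.base
# [1] 1 2
```

Names rebuilt from month.name no longer carry the directory, so they would presumably only resolve when the working directory is the one that contains the files.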

On Tue, Oct 9, 2018 at 1:38 PM, Ek Esawi <[hidden email]> wrote:


Re: Reorder file names read by list.files function

Ek Esawi
In reply to this post by Jeff Newmiller
Thank you, Jeff. It is an excellent idea, and I might try it if
nothing else works out. Also, I don't have 12 files in each subdirectory.

EK
On Tue, Oct 9, 2018 at 11:30 AM Jeff Newmiller <[hidden email]> wrote:


Re: Reorder file names read by list.files function

Ek Esawi
In reply to this post by R help mailing list-2
Thank you, Bill and Rui. I used month.name with sort and basename, as
suggested by Bill. I got the sorted numerical values, then I used
month.name to get the properly ordered month names. The problem is that I
have to paste the extension PDF onto the names to get the correctly
ordered file names, but then I get the same error message, which
suggests that the code is not reading the files properly.

I have not tried Rui's approach yet, but I will if nothing else works out.

Thanks again--EK

I had to strip the extension PDF from file.names, but when I paste
month.name together with .PDF to get the correct file names, I get
the same error.
On Tue, Oct 9, 2018 at 4:47 PM William Dunlap <[hidden email]> wrote:


Re: Reorder file names read by list.files function

Duncan Murdoch-2
On 10/10/2018 7:23 PM, Ek Esawi wrote:
> Thank you Bill and RUI. I use month.name with sort and basename, as
> suggested by Bill. i got the sorted numerical values, then i use
> month.name to get proper ordered month names. The problem is that i
> have to paste to the names the extension PDF giving me the correct
> ordered file names, but then i get the same error message which
> suggest that the code is not reading the files properly

You shouldn't need to do any pasting.  Extract the months, use the
order() function to find their proper order, then apply that vector to
the original vector of filenames.

Duncan Murdoch
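
A sketch of that recipe. The file names are invented; the two-digit number after the month is an assumption based on the str_remove() call in the earlier code, and only the ordering step is shown.

```r
# Invented full paths; the " 17" suffix is an assumption about the real names
file.names <- c("C:/tmp/April 17.PDF", "C:/tmp/February 17.PDF",
                "C:/tmp/January 17.PDF")

# 1. Extract the month: the leading letters of the base name
months <- sub("^([[:alpha:]]+).*$", "\\1", basename(file.names))

# 2. Find the calendar order of those months
ord <- order(match(months, month.name))

# 3. Apply it to the original, unmodified file names
ordered.names <- file.names[ord]
ordered.names
```

Because ordered.names holds the original paths, it can be passed straight to lapply() with extract_tables, with no extension or directory to paste back on.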


Re: Reorder file names read by list.files function

R help mailing list-2
In reply to this post by Ek Esawi
You can paste the directory names, dirname(file.names), back on with
file.path() after you do the sorting.  A better idiom is to use order()
instead of sort() and use order's output to subscript file.names.  E.g.,
the following sorts by year and month number.

> file.names <- c("C:/tmp/June_2018.PDF", "C:/tmp/May_2018.PDF", "C:/tmp/October_2016.PDF")
> bfile.names <- sub("\\..*$", "", basename(file.names))
> bfile.names
[1] "June_2018"    "May_2018"     "October_2016"
> month <- sub("^([[:alpha:]]+)_.*$", "\\1", bfile.names)
> month
[1] "June"    "May"     "October"
> year <- as.integer(sub("^.*_([[:digit:]]+)$", "\\1", bfile.names))
> year
[1] 2018 2018 2016
> month.number <- match(month, month.name)
> month.number
[1]  6  5 10
> file.names[ order(year, month.number) ]
[1] "C:/tmp/October_2016.PDF" "C:/tmp/May_2018.PDF"
[3] "C:/tmp/June_2018.PDF"




Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Oct 10, 2018 at 4:23 PM, Ek Esawi <[hidden email]> wrote:

> Thank you Bill and Rui. I used month.name with sort and basename, as
> suggested by Bill. I got the sorted month numbers, then used
> month.name to get the properly ordered month names. The problem is that
> I have to paste the .PDF extension back onto the names to get the
> correctly ordered file names, and then I get the same error message,
> which suggests that the code is not reading the files properly.
>
> I have not tried Rui's suggestion yet, but I will if nothing else works.
>
> Thanks again--EK
>
> I had to strip the .PDF extension off file.names, but when I paste
> .PDF back onto month.name to get the correct file names, I get
> the same error.
> On Tue, Oct 9, 2018 at 4:47 PM William Dunlap <[hidden email]> wrote:
> >
> > Use basename(filename) to remove the leading parts of the full path to
> > the file.  E.g., replace
> >    FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
> > with (the untested)
> >    FNs <- sort(match(sub("\\.PDF", "", basename(file.names)), month.name))
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> > On Tue, Oct 9, 2018 at 1:38 PM, Ek Esawi <[hidden email]> wrote:
> >>
> >> Hi again,
> >>
> >> I worked with Rui's idea of using the match function with month.name.
> >> I got numerical values for the months, then sorted them and pasted on
> >> the PDF file extension. It gave me the file order I wanted, but now
> >> statements 8, 9, and 10 don't work and I keep getting the error listed
> >> below. The dilemma is that if I add full.names=TRUE in statement 6,
> >> then statements 9 and 10 don't produce what they did earlier. If I
> >> put full.names=FALSE, then I am back to square one.
> >> Any ideas are greatly appreciated.
> >>
> >> The code
> >>
> >> 1. install.packages("tabulizer")
> >> 2. install.packages("stringr")
> >> 3. library(stringr)
> >> 4. library(tabulizer)
> >> 5. path = "C:/Users/namei/Documents/TextMining/S2017"
> >> 6. file.names <- dir(path, pattern =".PDF",full.names = TRUE)
> >> 7. file.names <- str_remove(file.names,"\\s[0-9][0-9]")
> >> 8. FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
> >> 9. FNs1 <- paste0(month.name[FNs],".","PDF")
> >> 10. A <- lapply(FNs1, function(i) extract_tables(i))
> >>
> >> Output and the error message.
> >>
> >> path = "C:/Users/eesawi/Documents/TextMining/S2017"
> >> > file.names <- dir(path, pattern =".PDF",full.names = TRUE)
> >> > file.names <- str_remove(file.names,"\\s[0-9][0-9]")
> >> > FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
> >> > FNs1 <- paste0(month.name[FNs],".","PDF")
> >> > A <- lapply(FNs1, function(i) extract_tables(i))
> >>
> >>  Error in normalizePath(path.expand(path), winslash, mustWork) :
> >>   path[1]=".PDF": The system cannot find the file specified


Re: Reorder file names read by list.files function

Rui Barradas
Hello,


The built-in constant is month.name, not month.names, so there is no need
to define it by hand:

> month.names
Error: object 'month.names' not found
> month.name
 [1] "January"   "February"  "March"     "April"     "May"       "June"
 [7] "July"      "August"    "September" "October"   "November"  "December"


Hope this helps,

Rui Barradas
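
As a side note to the month.name constant, a factor whose levels are
month.name gives calendar ordering without an explicit match() call. The
following is a minimal sketch; the vector months is made up for illustration
and does not come from the thread.

```r
# Made-up month names in arbitrary order
months <- c("June", "May", "October", "January")

# The levels fix the calendar order; a value not in month.name would become NA
month_factor <- factor(months, levels = month.name)
sorted_months <- as.character(sort(month_factor))
print(sorted_months)

# month.abb is the companion built-in holding the three-letter abbreviations
abbr <- month.abb[match(sorted_months, month.name)]
print(abbr)
```

Both month.name and month.abb hold English names regardless of locale, so
this approach only applies when the files are named in English.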

At 01:05 on 11/10/2018, William Dunlap via R-help wrote:

>> month.names
> Error: object 'month.names' not found
>> month.names <-
> c("January","February","March","April","May","June","July","August","September","October","November","December")


Re: Reorder file names read by list.files function

Ek Esawi
In reply to this post by Ek Esawi
Thank you all. Bill's original idea worked well. I did not realize
that I had to paste the full directory name onto the correctly ordered
file names. Once that was done, it worked. I will try Rui's idea, and I
think Jeff's idea of rearranging the output after extracting the
tables might also work, so I will try it and see.

Thank you all.

EK
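
For readers wondering why the earlier error complained about a file called
just ".PDF": a likely explanation, reconstructed here rather than stated in
the thread, is that match() on full paths returns NA for every file, sort()
drops the NAs, and paste0() then pastes the extension onto an empty vector.
The paths below are invented for illustration.

```r
# Made-up full paths, in the form dir(path, full.names = TRUE) returns
file.names <- c("C:/docs/February.PDF", "C:/docs/January.PDF",
                "C:/docs/March.PDF")

# Matching the full path (minus extension) against month.name finds nothing,
# and sort() silently drops the resulting NAs
bad <- sort(match(sub("\\.PDF$", "", file.names), month.name))
print(length(bad))

# A zero-length vector is treated as "" by paste0(), leaving only the suffix
bad_names <- paste0(month.name[bad], ".", "PDF")
print(bad_names)

# Ordering the original vector keeps directory and extension attached
idx <- order(match(sub("\\.PDF$", "", basename(file.names)), month.name))
good <- file.names[idx]
print(good)
```

Subscripting with order() avoids rebuilding the names at all; the
alternative is to rebuild them with file.path(dirname(file.names), ...).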
