how to locate specific line?


how to locate specific line?

Jinsong Zhao-2
Hi there,

I have a very large text file and need to find the line(s) containing a
specific string, for example "Data Points". At the moment I use the
following code:

df <- readLines(file)
l <- grep("Data Points", df)

However, this reads the entire file into R. When the file is huge, that
costs a lot of memory and time.

Is there a more elegant way to do this? Thanks.

Best,
Jinsong

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: how to locate specific line?

Jérémie Juste

Hello,

If you need to stay in R, the fread function from data.table
can speed up the data import.

If not, and you work on a GNU/Linux distribution, awk might help:
https://stackoverflow.com/questions/5536018/how-to-print-matched-regex-pattern-using-awk
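As a rough sketch of the awk suggestion, awk can also be called from R so that only the matching line numbers come back into the session. The helper name awk_line_numbers is hypothetical, and the sketch assumes awk is on the PATH and that the pattern contains no "/" characters (they would need escaping inside an awk regex).

```r
# Hypothetical helper (not from the thread): ask awk for the numbers of
# the lines that match `pattern`, so R never holds the file contents.
awk_line_numbers <- function(pattern, path) {
  if (!nzchar(Sys.which("awk"))) stop("awk not found on PATH")
  # NR is awk's built-in record (line) counter.
  program <- sprintf("/%s/ { print NR }", pattern)
  out <- system2("awk", c(shQuote(program), shQuote(path)), stdout = TRUE)
  as.integer(out)
}

# Small demonstration on a temporary file.
tf <- tempfile()
writeLines(c("One", "Two", "Three", "Four"), tf)
if (nzchar(Sys.which("awk"))) print(awk_line_numbers("^T", tf))
```

The file is still scanned from start to end, but the scan happens outside R, so memory use inside R stays small.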

HTH,
Jeremie


Re: how to locate specific line?

R help mailing list-2
In reply to this post by Jinsong Zhao-2
If Sys.which("grep") reports that grep is available, then
system("grep -n ...") will do it.

> cat(c("One","Two","Three","Four"),sep="\n",file=tf<-tempfile())
> system(paste("grep --line-number", shQuote("^T"), shQuote(tf)),
+        intern=TRUE)
[1] "2:Two"   "3:Three"
> as.integer(sub(":.*$", "", .Last.value))
[1] 2 3

grep is always available on Unix-like systems and is included in the
Rtools and Cygwin distributions on Windows.  I don't know how standard
the '-n' (aka --line-number) flag is.



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Jul 27, 2018 at 1:13 AM, Jinsong Zhao <[hidden email]> wrote:

> Hi there,
>
> I have a large/huge text file. I need to locate a line in the file with a
> specific string, for example, "Data Points". Now, I use the following code
> to do:
>
> df <- readLines(file)
> l <- grep("Data Points", df)
>
> However, in this case, the file will be read throughout into R. When the
> file is huge, it will cost much memory and time.
>
> Is there any more elegant way to do that? Thanks.
>
> Best,
> Jinsong