[R] how to build TermDocMatrix in tm text mining package of R

Kum-Hoe Hwang
Howdy Gurus,

I'd like to ask how to build a TermDocMatrix with the tm text mining
package.

It is not clear to me how to import a plain text file and then convert
it into a TermDocMatrix.
How can I build a TermDocMatrix from a plain text document file for text
association?
Or are there any good manuals?

Thank you in advance,

--
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to build TermDocMatrix in tm text mining package of R

Tony Breyal
Hi there, I think something like the following is what you want:

### R start...
# if you put your plain text files in a folder like this
my.path <- 'C:\\Documents and Settings\\tony\\Desktop\\texts\\'

# then you can construct a simple tdm like this
library(tm)
my.corpus <- Corpus(DirSource(my.path),
                    readerControl = list(reader = readPlain))
my.tdm <- TermDocMatrix(my.corpus)

# this should show how words are distributed in the first text document
my.tdm[1, ]
### R end.
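
For readers on a current tm release: as far as I know, the TermDocMatrix constructor used above is no longer available, and TermDocumentMatrix (terms as rows) or DocumentTermMatrix (documents as rows) take its place. The following is only a sketch under that assumption. The temporary folder and the three tiny sample files are made up for illustration and stand in for my.path; DocumentTermMatrix is chosen so that row 1 is still the first document, as in the indexing above.

```r
library(tm)

# Hypothetical sample data: three small plain text files in a temporary folder
my.path <- file.path(tempdir(), "texts")
dir.create(my.path, showWarnings = FALSE)
writeLines("cat mat",     file.path(my.path, "doc1.txt"))
writeLines("cat mat dog", file.path(my.path, "doc2.txt"))
writeLines("dog bone",    file.path(my.path, "doc3.txt"))

# Read every file in the folder as one plain text document
my.corpus <- VCorpus(DirSource(my.path),
                     readerControl = list(reader = readPlain))

# Documents as rows, terms as columns
my.dtm <- DocumentTermMatrix(my.corpus)

# Term counts for the first document
inspect(my.dtm[1, ])

# Terms whose counts correlate with "cat" across documents (text association)
assoc <- findAssocs(my.dtm, "cat", corlimit = 0.5)
print(assoc)
```

With real data, point my.path at your own folder and drop the writeLines() calls. Note that the default settings ignore words shorter than three characters.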

By the way, there are some nice examples of using the tm package in
the latest issue of R News (Volume 8/2, October 2008), in the article
'An Introduction to Text Mining in R':
http://cran.r-project.org/doc/Rnews/Rnews_2008-2.pdf

Hope that helps a little bit,
Tony Breyal

On 9 Jan, 14:21, "Kum-Hoe Hwang" <[hidden email]> wrote:

> How can I build a TermDocMatrix from a plain text document file for text
> association?
> Or are there any good manuals?

Re: [R] how to build TermDocMatrix in tm text mining package of R

Kum-Hoe Hwang
Thank you very much for your comments.

Thanks to your help, I now understand the workflow of a text analysis.

However, I could not run the R script above because the tm package does
not work on my PC, which is a critical problem.

Kum Hwang Ph.D.
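
The message above does not say which error tm produced, so the cause cannot be identified from the thread. One way to narrow such a problem down is sketched below. The helper name check_pkg is made up for illustration; it only reports whether a package is installed and loadable and does not install anything.

```r
# Hypothetical helper: report whether a package is installed and can be loaded
check_pkg <- function(pkg) {
  # system.file() returns "" when the package is not installed
  installed <- nzchar(system.file(package = pkg))
  # Try to load the namespace without attaching it or printing messages
  loadable <- installed && requireNamespace(pkg, quietly = TRUE)
  version <- if (installed) as.character(packageVersion(pkg)) else NA_character_
  list(package = pkg, installed = installed,
       loadable = loadable, version = version)
}

str(check_pkg("tm"))

# If tm is missing or broken, reinstalling it together with its
# dependencies is a common first step (left commented out on purpose):
# install.packages("tm", dependencies = TRUE)

# R version string, worth including in any follow-up to the list
print(R.version.string)
```

Posting the exact error message together with the output of sessionInfo() usually makes it much easier for the list to help.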


On Sat, Jan 10, 2009 at 12:39 AM, Tony Breyal
<[hidden email]> wrote:

> Hi there, I think something like the following is what you want: