SoS! How to predict new values using linear regression models?

LosemindL
Hi all,

After a few hours of trial and error on my own, I have decided to ask for
your help.

My training set is a 200 x 2 matrix: 200 observations, with one column
for each of the two independent variables.

-----------------
ss=data.frame(trainingSet);
result=lm(trainingClass~ss$X1+ss$X2);
-----------------

where trainingClass denotes the true classes of the training data.

Now I want to apply the model to predict new data:

-----------------
> gg=predict(result, data.frame(X1=1, X2=2))
Warning message:
'newdata' had 1 rows but variable(s) found have 200 rows
-----------------

That is, I supply one new observation of the 2 independent variables
(1 row, 2 columns), converted to a data frame.

However, R never gives me a predicted value for this ONE NEW observation.
Instead, it keeps giving me the warning above and printing the fitted
values for the 200 training samples...

That's very bad.

Please help me!

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: SoS! How to predict new values using linear regression models?

Gabor Grothendieck
Leaving aside the issue of whether linear regression is appropriate here,
do it like this. I have used the built-in iris data frame since I don't
have access to your ss:

iris.lm <- lm(as.numeric(Species) ~ Sepal.Length + Sepal.Width, iris)
predict(iris.lm, data.frame(Sepal.Length = 3, Sepal.Width = 2))

Re: SoS! How to predict new values using linear regression models?

PIKAL Petr
Hi


On 29 Jan 2006 at 17:28, Gabor Grothendieck wrote:

> Leaving aside the issue of whether linear regression is appropriate
> here, do it like this where I have used the builtin iris data frame
> since I don't have access to your ss:
>
> iris.lm <- lm(as.numeric(Species) ~ Sepal.Length + Sepal.Width, iris)
> predict(iris.lm, data.frame(Sepal.Length = 3, Sepal.Width = 2))
>
> On 1/29/06, Michael <[hidden email]> wrote:
> > -----------------
> > ss=data.frame(trainingSet);
> > result=lm(trainingClass~ss$X1+ss$X2);
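                            ^^^^^ ^^^^^

As Gabor suggested, use the data argument:

result=lm(trainingClass~X1+X2, data=ss)

and your predict() call should work. With ss$X1 in the formula, predict()
looks for a variable called ss$X1 rather than the X1 column of newdata, so
it finds the 200 training values again.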
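
To make the difference concrete, here is a minimal sketch on simulated data. The original data was not posted, so trainingSet and trainingClass below are made-up stand-ins, and the names bad, good and new_obs are only illustrative.

```r
# Simulated stand-in for the poster's data (an assumption, not the real data)
set.seed(1)
trainingSet   <- matrix(rnorm(400), ncol = 2)      # 200 x 2 matrix
trainingClass <- sample(1:2, 200, replace = TRUE)  # made-up class labels
ss <- data.frame(trainingSet)                      # columns are named X1, X2

new_obs <- data.frame(X1 = 1, X2 = 2)

# Formula written with ss$X1 / ss$X2: predict() does not find those terms
# in newdata, so it evaluates them on the 200 training rows (with a warning).
bad <- lm(trainingClass ~ ss$X1 + ss$X2)
length(suppressWarnings(predict(bad, new_obs)))

# Formula written with X1 / X2 plus data = ss: the newdata columns are used.
good <- lm(trainingClass ~ X1 + X2, data = ss)
length(predict(good, new_obs))
```

The first length() call should report 200 values and the second a single prediction, which is the behaviour described in the original question and its fix.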

HTH
Petr

match() & seq()

Jacques VESLOT
Sorry if this has already been discussed, but I can't understand this:

> seq(0.1,1,by=0.1)
 [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> match(0.1,seq(0.1,1,by=0.1))
[1] 1
> match(0.2,seq(0.1,1,by=0.1))
[1] 2
> match(0.3,seq(0.1,1,by=0.1))
[1] NA
> match(0.4,seq(0.1,1,by=0.1))
[1] 4

> R.version
         _            
platform i386-pc-mingw32
arch     i386          
os       mingw32      
system   i386, mingw32
status                
major    2            
minor    2.1          
year     2005          
month    12            
day      20            
svn rev  36812        
language R

Re: match() & seq()

Liaw, Andy
Hope this is clear:

> x <- seq(0.1, 1, by=0.1)
> 0.3 == x[3]
[1] FALSE
> abs(0.3 - x[3])
[1] 5.551115e-17


Andy

Re: match() & seq()

Dimitris Rizopoulos
In reply to this post by Jacques VESLOT
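The problem is the difference between

0.3 == seq(0.1,1,by=0.1)[3]

which tests exact equality, and

all.equal(0.3, seq(0.1,1,by=0.1)[3])

which tests equality up to a small numerical tolerance.

See ?Comparison for more information.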
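
A small sketch of how the two comparisons might be used in practice. One point worth knowing from ?all.equal is that it returns a character description, not FALSE, when the values differ, so it is usually wrapped in isTRUE() inside a condition. The variable name x is just for illustration.

```r
x <- seq(0.1, 1, by = 0.1)

# Exact comparison: FALSE in the session shown in this thread, because
# x[3] differs from the literal 0.3 in its last bits.
0.3 == x[3]

# all.equal() compares up to a small tolerance. Wrap it in isTRUE()
# before using it in an if() condition.
if (isTRUE(all.equal(0.3, x[3]))) {
  message("0.3 and x[3] are equal up to numerical tolerance")
}

# For genuinely different values all.equal() returns a description string,
# so isTRUE() turns that into a plain FALSE.
isTRUE(all.equal(0.3, 0.31))
```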

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm



Re: match() & seq()

Roger Bivand
In reply to this post by Jacques VESLOT
On Mon, 30 Jan 2006, Jacques VESLOT wrote:

> sorry if it has already been discussed but i can't understand this:
>
>  > seq(0.1,1,by=0.1)
>  [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
>  > match(0.1,seq(0.1,1,by=0.1))
> [1] 1
>  > match(0.2,seq(0.1,1,by=0.1))
> [1] 2
>  > match(0.3,seq(0.1,1,by=0.1))
> [1] NA
>  > match(0.4,seq(0.1,1,by=0.1))
> [1] 4

This is R FAQ 7.31:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

> print(seq(0.1,1,by=0.1), digits=20)
 [1] 0.10000000000000000555 0.20000000000000001110 0.30000000000000004441
 [4] 0.40000000000000002220 0.50000000000000000000 0.59999999999999997780
 [7] 0.70000000000000006661 0.80000000000000004441 0.90000000000000002220
[10] 1.00000000000000000000
> match(c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9), seq(0.1,1,by=0.1))
[1]  1  2 NA  4  5  6 NA  8  9
> all.equal(c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0), seq(0.1,1,by=0.1),
+ tolerance = .Machine$double.eps ^ 2)
[1] "Mean relative  difference: 1.665335e-16"
> all.equal(c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0), seq(0.1,1,by=0.1))
[1] TRUE
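
If a tolerant lookup is what is wanted, one possible approach is a small wrapper that matches within a tolerance. The function match_tol() below is a hypothetical helper written for this illustration, not part of base R, and the default tolerance of 1e-8 is an arbitrary choice.

```r
# match_tol() is a hypothetical helper, not a base R function.
# For each element of x it returns the index of the first element of
# `table` that lies within `tol` of it, or NA if there is none.
match_tol <- function(x, table, tol = 1e-8) {
  vapply(x, function(xi) {
    hit <- which(abs(table - xi) < tol)
    if (length(hit) > 0L) hit[1L] else NA_integer_
  }, integer(1))
}

s <- seq(0.1, 1, by = 0.1)

# 0.3 and 0.7 are found at positions 3 and 7; 0.35 is not in the sequence.
match_tol(c(0.3, 0.7, 0.35), s)
```

Unlike match(), this scans the whole table for every element of x, so it is only a sketch for short vectors.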


--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [hidden email]

Re: match() & seq()

Jacques VESLOT
Thanks Roger, Andy and Dimitris.
Although I am familiar with this behaviour **in some cases**, I couldn't
work out yesterday evening why it matched 0.4 but not 0.3. Of course
these numbers are not integers, but I believed match() dealt with such
equalities.
I will have a look at the article mentioned in FAQ 7.31, since not
everything is clear to me yet.
> identical(0.4-0.3,0.1)
[1] FALSE
> all.equal(0.4-0.3,0.1)
[1] TRUE
 
jacques
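
One way to see why 0.4 matches while 0.3 does not: assuming seq() builds its values as from + k * by (an assumption about its internals, though it agrees with the digits printed in Roger's reply), each sum is rounded to the nearest double, and that rounded result sometimes coincides with the double nearest the decimal literal and sometimes lands one step away. A sketch, with built and literal as illustrative names:

```r
# Assumption: seq(0.1, 1, by = 0.1) computes 0.1 + k * 0.1 for k = 0, 1, ...
k <- 0:9
built   <- 0.1 + k * 0.1
literal <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0)

# TRUE where the computed sum is exactly the same double as the literal.
# With IEEE 754 doubles the third entry is FALSE and the fourth is TRUE.
data.frame(literal, same = built == literal)

# Print both values with 17 significant digits to see the last bits differ.
sprintf("%.17g", built[3])
sprintf("%.17g", 0.3)
```

match() uses exact comparison for numbers, so it only finds the entries where same is TRUE.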


Re: match() & seq()

PIKAL Petr
Hi

?identical

     The safe and reliable way to test two objects for being _exactly_
                                                             ^^^^^^^^^
     equal.  It returns 'TRUE' in this case, 'FALSE' in every other
     case.

?all.equal

     'all.equal(x,y)' is a utility to compare R objects 'x' and 'y'
     testing "near equality".  If they are different, comparison is
              ^^^^^^^^^^^^^
     still made to some extent, and a report of the differences is
     returned.
 
HTH
Petr




