

Hey all,
I seem to be having trouble fitting a spline to a large set of data using PSpline. It seems to work fine for a data set of size n=4476, but not for anything larger (say, n=4477). For example:
THIS WORKS:

random = array(0,c(4476,2))
random[,1] = runif(4476,0,1)
random[,2] = runif(4476,0,1)
random = random[order(random[,1]),]
plot(random[,1],random[,2])
fit2 = sm.spline(random[,1],random[,2], norder=2, cv=FALSE)
lines(fit2$x,fit2$y)
THIS FAILS:

random = array(0,c(4477,2))
random[,1] = runif(4477,0,1)
random[,2] = runif(4477,0,1)
random = random[order(random[,1]),]
plot(random[,1],random[,2])
fit2 = sm.spline(random[,1],random[,2], norder=2, cv=FALSE)
lines(fit2$x,fit2$y)
It gives:
Error in smooth.Pspline(x = ux, y = tmp[, 1], w = tmp[, 2], method = method, :
Singularity error in solving equations
Does anyone know if this is just a limitation, or am I missing something. The dataset I'd like to run it on contains n=6000. If it is a limitation, does anyone know of any other ways to do this that would accommodate a larger dataset (ideally with generalized cross validation)?
Thanks!
guy33


Use the smooth.spline() function in "stats" package. This is more stable.
?smooth.spline
Ravi.
________________________________________
From: [hidden email] [ [hidden email]] On Behalf Of guy33 [ [hidden email]]
Sent: Sunday, May 29, 2011 1:30 PM
To: [hidden email]
Subject: [R] Fitting spline using Pspline
Hey all,
I seem to be having trouble fitting a spline to a large set of data using
PSpline. It seems to work fine for a data set of size n=4476, but not for
anything larger (say, n=4477). For example:
THIS WORKS:

random = array(0,c(4476,2))
random[,1] = runif(4476,0,1)
random[,2] = runif(4476,0,1)
random = random[order(random[,1]),]
plot(random[,1],random[,2])
fit2 = sm.spline(random[,1],random[,2], norder=2, cv=FALSE)
lines(fit2$x,fit2$y)
THIS FAILS:

random = array(0,c(4477,2))
random[,1] = runif(4477,0,1)
random[,2] = runif(4477,0,1)
random = random[order(random[,1]),]
plot(random[,1],random[,2])
fit2 = sm.spline(random[,1],random[,2], norder=2, cv=FALSE)
lines(fit2$x,fit2$y)
It gives:
Error in smooth.Pspline(x = ux, y = tmp[, 1], w = tmp[, 2], method = method,
:
Singularity error in solving equations
Does anyone know if this is just a limitation, or am I missing something.
The dataset I'd like to run it on contains n=6000. If it is a limitation,
does anyone know of any other ways to do this that would accommodate a
larger dataset (ideally with generalized cross validation)?
Thanks!
guy33

View this message in context: http://r.789695.n4.nabble.com/FittingsplineusingPsplinetp3559202p3559202.htmlSent from the R help mailing list archive at Nabble.com.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


Ravi,
Thanks so much! You're right, smooth.spline does work on larger n.
Although, for some reason it's results are different (slightly less good?, but I'm not sure). For example, on the simple doppler function below, sm.spline seems to be closer to the true function than smooth.spline:
x=array(0,1000)
y=array(0,1000)
for (i in 1:1000){
x[i] = i/1000
y[i] = (x[i]*(1x[i]))^.5 * sin(2*pi*(1.05/(x[i]+.05)))
}
plot(x,y)
fit = sm.spline(x, y, norder=2, cv=FALSE)
lines(fit$x,fit$y)
fit2 = smooth.spline(x, y, cv=FALSE)
lines(fit2$x,fit2$y)
What do you make of that?
guy33


Yes, you are right that the results of smooth.spline are slightly worse than that of sm.spline.
The Doppler function is "tricky". At small `x' values, it oscillates rapidly. Hence it is not surprising that the smoothers do not do as well.
Here is a noisy version of your Doppler function. I have also considered another smoother `glkerns'. As you can see, the smoothers do better for larger `x' than for small `x'. It is impossible to distinguish changes in function from noise.
require(pspline)
require(lokern)
x=array(0,1000)
y=array(0,1000)
for (i in 1:1000){
x[i] = i/1000
y[i] = (x[i]*(1x[i]))^.5 * sin(2*pi*(1.05/(x[i]+.05)))
}
y < y * (1 + rnorm(1000, 0, 0.2))
plot(x,y, cex=0.4, xlim=c(0,0.1))
fit = sm.spline(x, y, norder=2, cv=FALSE)
lines(fit$x,fit$y, col=2)
fit2 = smooth.spline(x, y, cv=FALSE)
lines(fit2$x,fit2$y, col=3)
fit3 = glkerns(x, y)
lines(fit3$x.out,fit3$est, col=4)
Ravi.
________________________________________
From: [hidden email] [ [hidden email]] On Behalf Of guy33 [ [hidden email]]
Sent: Sunday, May 29, 2011 6:28 PM
To: [hidden email]
Subject: Re: [R] Fitting spline using Pspline
Ravi,
Thanks so much! You're right, smooth.spline does work on larger n.
Although, for some reason it's results are different (slightly less good?,
but I'm not sure). For example, on the simple doppler function below,
sm.spline seems to be closer to the true function than smooth.spline:
x=array(0,1000)
y=array(0,1000)
for (i in 1:1000){
x[i] = i/1000
y[i] = (x[i]*(1x[i]))^.5 * sin(2*pi*(1.05/(x[i]+.05)))
}
plot(x,y)
fit = sm.spline(x, y, norder=2, cv=FALSE)
lines(fit$x,fit$y)
fit2 = smooth.spline(x, y, cv=FALSE)
lines(fit2$x,fit2$y)
What do you make of that?
guy33

View this message in context: http://r.789695.n4.nabble.com/FittingsplineusingPsplinetp3559202p3559610.htmlSent from the R help mailing list archive at Nabble.com.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.

