# Re: Error survreg: Density function returned an an invalid matrix

4 messages
Open this post in threaded view
|

## Re: Error survreg: Density function returned an an invalid matrix

 Thanks Terry but the error persists. See: > library(foreign)> library(survival)> library(VGAM) > mypareto <- list(name='Pareto',+                  init= function(x, weights,parms){+                    alpha <- length(x)/(sum(log(x)))#this is a MLE for alpha+                    c(media <-(alpha*1/(alpha-1)),varianza <- ((1/alpha)^2)*(alpha/(alpha-2)))},+                  density= function(x,weights) {+                    alpha <- length(x)/(sum(log(x)))+                    cdf1 <- function(x, alpha) ifelse(x > 1 , 1 - (1/x)**alpha, 0 )+                    cdf2 <- function(x, alpha) ifelse(x > 1, (1/x)**alpha ,0)+                    distribution <- function(x, alpha) ifelse(x > 1 , alpha/(x**(alpha+1)), 0)+                    firstdev <- function(x, alpha) ifelse(x > 1, -(alpha+x)/x, 0)+                    seconddev <- function(x, alpha) ifelse(x > 1, (alpha+1)*(alpha+2)/x^2,0)+                    cbind(cdf1(x,alpha),cdf2(x, alpha), distribution(x,alpha),firstdev(x,alpha),seconddev(x,alpha))},+                  devian!  ce=function(x) {stop('deviance residuals not defined')},+                  quantile= function(p, alpha) ifelse(p < 0 | p > 1, NaN, 1*(1-p)**(-1/alpha)))> > survregDtest(mypareto, TRUE)[1] TRUE> set.seed(1)> a <- rpareto(100, 1, 1) > b <- rnorm(100,5,1)> c <- rep(1,100)> base <- cbind.data.frame(a,b,c)> mod1<-survreg(Surv(a, c) ~ b, base, dist = mypareto)Error in survreg.fit(X, Y, weights, offset, init = init, controlvals = control,  :   Density function returned an invalid matrix 2015-11-04 7:52 GMT-06:00 Therneau, Terry M., Ph.D. <[hidden email]>: > Hi, I want to perform a survival analysis using survreg procedure from >> survival library in R for a pareto distribution for a time variable, so I >> set the new distribution using the following sintax: >> >>      library(foreign) >>      library(survival) >>      library(VGAM) >> >>      mypareto <- list(name='Pareto', >>                   init= function(x, weights,parms){ >> > etc. > > The survreg routine fits location-scale distributions such that (t(y) - > Xb)/s ~ F, where t is an optional transformation, F is some fixed > distribution and X is a matrix of covariates.  For any distribution the > questions to ask before trying to add the distribution to survreg are >   - can it be written in a location-scale form? >   - if so, how do the parameters of the distribution map to the location > (Xb) and scale (s). > > In fitting data we normally have per-subject location (X b) but an > intercept-only model is of course possible. > > If y is Weibull then log(y) fits into the framework, which is how survreg > fits it.  The transformation of parameters location and scale parameters > for log(y) back to the usual Weibull parameterization for y often trips > people up (see comments in the Examples section of ?survreg). > > The log of a Pareto can be written in this form (I think?).  The two > parameters are the scale a and lower limit b, with survival function of > S(x)= (b/x)^a, for x >= b.  If y = log(x) the survival function for y is > S(y) = (b/exp(y))^a = exp[-(y - log(b))/(1/a)], which has location log(b) > and scale 1/a. But even if I am correct the discontinuity at b will cause > the underlying Newton-Raphson method to fail. > >  Terry Therneau >         [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: Error survreg: Density function returned an an invalid matrix

 The error message states that there is an invalid value for the density.  A long stretch of code is not very helpful in understanding this.  What we need are the definition of your density -- as it would be written in a textbook.  This formula needs to give a valid response for the range -infinity to +infinity.  Or more precisely, for any value that the maximizer might guess at some point during the iteration. Terry T. On 11/14/2015 05:00 AM, [hidden email] wrote: > Thanks Terry but the error persists. See: > >> >library(foreign)> library(survival)> library(VGAM) > mypareto <- list(name='Pareto',+                  init= remainder of message trucated ______________________________________________ [hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
 I don´t know how to write a pareto distribution in that form, I want a pareto function for time because I have a time variable that fits that distribution. For a weibull and lognormal it is very easy because they are particular cases from a extreme value and gaussian distributions. I think it is possible to write a pareto using an exponential distribution,but I´m not sure, I tried this using : # OPTION 1. # Define one distribution in terms of another library(foreign) library(survival) library(VGAM) my.pareto <- survreg.distributions$exponential my.pareto$name <- "Pareto" my.pareto$scale <- NULL #Using the following transformation: my.pareto$dtrans <- function(y) min(y)*exp(y) survregDtest(my.pareto, TRUE) set.seed(1) a <- rpareto(100, 1, 6) b <- rnorm(100,5,1) c <- rep(1,100) base <- cbind.data.frame(a,b,c) mod1 <- survreg(Surv(a, c) ~ b, base, dist = my.pareto) summary(mod1) # It works but I don´t know if it's correct # OPTION 2. # Using Density, distribution function, quantile function and random generation for the Pareto(I) distribution # from VGAM package. my.pareto3 <- list(name='Pareto',                  init= function(x, weights,alpha,k){                    alpha <- length(x)/(sum(log(x))-length(x)*log(min(x)))                    k <-min(x)                    c(media <-(alpha*k/(alpha-1)),varianza <- ((k/alpha)^2)*(alpha/(alpha-2)))},                  density= function (x, alpha,k)  {                    alpha <- length(x)/(sum(log(x))-length(x)*log(min(x)))                    k <-min(x)                    pvec <- seq(0.1, 0.9, by = 0.1)                    qvec <- qpareto(pvec, alpha, k)                      cbind(ppareto(qvec, alpha, k),                            1-ppareto(qvec, alpha,  k),                            dpareto(x,  alpha,  k),                            -(alpha+x)/x,                            (alpha+1)*(alpha+2)/x^2)},                  deviance=function(x) {stop('deviance residuals not defined')},                  quantile= function(alpha,k) qpareto(seq(0.1, 0.9, by = 0.1), alpha, k)) survregDtest(my.pareto3, TRUE) mod3 <- survreg(Surv(a, c) ~ b, base, dist = my.pareto3) # Did not work and I don't want a fixed value for alpha and k paremeters but the function needs a default. # I got this error: Error in logdensity[xok] <- log(shape[xok]) + shape[xok] * log(scale[xok]) - : NAs are not allowed in subscripted assignments 6 dpareto(x, alpha, k) 5 cbind(ppareto(qvec, alpha, k), 1 - ppareto(qvec, alpha, k), dpareto(x, alpha, k), -(alpha + x)/x, (alpha + 1) * (alpha + 2)/x^2) 4 density(z, parms) 3 derfun(y, yy, exp(vars), sd\$density, parms) 2 survreg.fit(X, Y, weights, offset, init = init, controlvals = control, dist = dlist, scale = scale, nstrat = nstrata, strata, parms = parms) 1 survreg(Surv(a, c) ~ b, base, dist = my.pareto3) So, I don't know what else can I do. Thanks. 2015-11-16 11:38 GMT-06:00 Therneau, Terry M., Ph.D. <[hidden email]>: > You are still missing the point. > The survreg routine handles distribution of the form: > >   (t(y) - m)/s ~ f, where f is a distribution on the real line. > > Here t is an optional but fixed transform and m= X\beta.  Beta and s=scale > are the parameters that the routine will fit. > > For a log-normal, t=log and f= the density of a Gaussian mean=0, sd=1. > The distribution function is dnorm(x) > For a Weibull,    t=log and f= the density of the least extreme value > distribution: exp(-exp(x)) > > How do you write a Pareto in this form?  I assume that you would like > survreg to solve for some parameters -- how do you map them onto the beta > and s values that survreg will attempt to optimize?  I have not yet grasped > what it is that you want survreg to DO. > > Terry T. > > > > > > > > > On 11/16/2015 08:56 AM, Israel Ortiz wrote: > > Thanks Terry, I use the following formula for density: > [image: f_X(x)= \begin{cases} \frac{\alpha > x_\mathrm{m}^\alpha}{x^{\alpha+1}} & x \ge x_\mathrm{m}, \\ 0 & x < > x_\mathrm{m}. \end{cases}] > > Where *x*m is the minimum value for x. I get this fórmula in > https://en.wikipedia.org/wiki/Pareto_distribution but there are a lot of > books and sites that use the same fórmula. This part of the code use that > formula: > >  distribution <- function(x, alpha) ifelse(x > min(x) , > alpha*min(x)**alpha/(x**(alpha+1)), 0) > > Also, I support my sintax in the following post: > > > http://stats.stackexchange.com/questions/78168/how-to-know-if-my-data-fits-pareto-distribution> > Another option is transform my variable for time from pareto to > exponential (but this solution it's not very elegant): > > If X is pareto distributed then > [image: Y = \log\left(\frac{X}{x_\mathrm{m}}\right)] > > it's exponential distributed. > > The syntax: > > library(foreign) > library(survival) > library(VGAM) > > set.seed(3) > X <- rpareto(n=100, scale = 5,shape =  1) > > Y <- log(X/min(X)) > > hist(X,breaks=100) > hist(Y,breaks=100) > b <- rnorm(100,5,1) > c <- rep(1,100) > base <- cbind.data.frame(X,Y,b,c) > mod1<-survreg(Surv(Y+1, c) ~ b, base, dist = "exponential")# +1 it's > because time should be > 1 > > summary(mod1) > > This solution works but I don´t like it. > > Thanks. > > > > > 2015-11-16 7:40 GMT-06:00 Therneau, Terry M., Ph.D. <[hidden email]>: > >> The error message states that there is an invalid value for the density. >> A long stretch of code is not very helpful in understanding this.  What we >> need are the definition of your density -- as it would be written in a >> textbook.  This formula needs to give a valid response for the range >> -infinity to +infinity.  Or more precisely, for any value that the >> maximizer might guess at some point during the iteration. >> >> Terry T. >> >> >> On 11/14/2015 05:00 AM, [hidden email] wrote: >> >>> Thanks Terry but the error persists. See: >>> >>> >library(foreign)> library(survival)> library(VGAM) > mypareto <- >>>> list(name='Pareto',+                  init= >>>> >>> >> remainder of message trucated >> > > >         [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.