FW: problem with markov random field smooths in mgcv

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

FW: problem with markov random field smooths in mgcv

chris wilcox
Hi all,
   
    I am trying to fit a model with a markov random field smooth in mgcv.  I am having some trouble with getting it to run, and in particular I am getting the message
   
    Error in initial.sp(w * x, S, off) : S[[1]] matrix is not +ve definite.
   
    After reading everything I could find on mrf, it sounds like there was a bug that was brought up with Simon Wood in 2012, due to differences between windows and linux, with the linus machine stopping due to this error, while windows was not.  I have not been able to find much else on it.  Any suggestions would be much appreciated.
   
    There is reproducible code below.
   
    Thanks
   
    Chris
   
   
    library(mgcv)
   
    #create data
    Country <- as.factor(c("Australia","Australia","Australia","Australia","Australia","Australia","Bangladesh","Bangladesh","Bangladesh",
    "Bangladesh","Bangladesh","Bangladesh","Cambodia","Cambodia","Cambodia","Cambodia","Cambodia","Cambodia",
    "China","China","China","China","China","China","East Timor","East Timor","East Timor",
    "East Timor","East Timor","East Timor","HighSeas1","HighSeas1","HighSeas1","HighSeas1","HighSeas1","HighSeas1",
    "HighSeas2","HighSeas2","HighSeas2","HighSeas2","HighSeas2","HighSeas2","China","China","China","China","China","China",
    "India","India","India","India","India","India","Indonesia","Indonesia","Indonesia","Indonesia","Indonesia","Indonesia",
    "Malaysia","Malaysia","Malaysia","Malaysia","Malaysia","Malaysia","Myanmar","Myanmar","Myanmar","Myanmar","Myanmar",
    "Myanmar","Philippines","Philippines","Philippines","Philippines","Philippines","Philippines","South Korea","South Korea",
    "South Korea","South Korea","South Korea","South Korea","China","China","China","China","China","China",
    "Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Taiwan","Taiwan","Taiwan","Taiwan",
    "Taiwan","Taiwan","Thailand","Thailand","Thailand","Thailand","Thailand","Thailand","Vietnam","Vietnam","Vietnam","Vietnam",
    "Vietnam","Vietnam"))
   
    Count <- c(0,0,3,5,1,5,0,0,0,0,0,1,0,0,0,0,0,3,0,0,2,1,0,6,0,0,0,1,0,0,0,1,0,0,0,0
    ,0,0,20,0,1,0,0,0,0,0,0,2,0,0,6,3,3,10,1,1,18,11,8,11,0,1,2,2,1,14,0,0,0,1,0,0
    ,0,0,4,3,9,16,0,0,3,0,0,1,0,0,1,0,0,0,0,0,33,18,8,16,0,0,0,0,0,2,0,1,14,6,8,2
    ,0,0,0,0,1,1)
   
    Data <- data.frame(Count,Country)
   
    #create neighbour matrix
    NB <- list()
    NB$'East Timor' <- c(1,2,15)
    NB$Australia <- c(1,2,15)
    NB$'Sri Lanka' <-c(3,12,16)
    NB$Bangladesh <-c(4,12,13,16)
    NB$Philippines <- c(5,6,11,14,15,17)
    NB$Taiwan <- c(5,6,11)
    NB$Thailand <- c(7,8,10,12,13,14,15)
    NB$Vietnam <- c(7,8,10,11,14,15)
    NB$`South Korea` <- c(9,11)
    NB$Cambodia <- c(7,8,10)
    NB$China <- c(5,6,8,9,11,14)
    NB$India <- c(3,4,7,12,13,15,16)
    NB$Myanmar <- c(4,7,12,13,16)
    NB$Malaysia <- c(5,7,8,11,14,15)
    NB$Indonesia <- c(1,2,5,7,8,12,14,15)
    NB$HighSeas2 <- c(3,4,12,13,16)
    NB$HighSeas1 <- c(5,17)
   
    #check levels and names match
    all.equal(sort(names(NB)), sort(levels(Data$Country)))
   
    #try fitting GAM
    m1 <- gam(Data$Count ~ s(Data$Country, bs = 'mrf', xt = list(nb = NB)))
   
   

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: FW: problem with markov random field smooths in mgcv

David Winsemius

On 3/18/20 12:44 AM, Wilcox, Chris (O&A, Hobart) wrote:

> Hi all,
>      
>      I am trying to fit a model with a markov random field smooth in mgcv.  I am having some trouble with getting it to run, and in particular I am getting the message
>      
>      Error in initial.sp(w * x, S, off) : S[[1]] matrix is not +ve definite.
>      
>      After reading everything I could find on mrf, it sounds like there was a bug that was brought up with Simon Wood in 2012, due to differences between windows and linux, with the linus machine stopping due to this error, while windows was not.  I have not been able to find much else on it.  Any suggestions would be much appreciated.
>      
>      There is reproducible code below.
>      
>      Thanks
>      
>      Chris
>      
>      
>      library(mgcv)
>      
>      #create data
>      Country <- as.factor(c("Australia","Australia","Australia","Australia","Australia","Australia","Bangladesh","Bangladesh","Bangladesh",
>      "Bangladesh","Bangladesh","Bangladesh","Cambodia","Cambodia","Cambodia","Cambodia","Cambodia","Cambodia",
>      "China","China","China","China","China","China","East Timor","East Timor","East Timor",
>      "East Timor","East Timor","East Timor","HighSeas1","HighSeas1","HighSeas1","HighSeas1","HighSeas1","HighSeas1",
>      "HighSeas2","HighSeas2","HighSeas2","HighSeas2","HighSeas2","HighSeas2","China","China","China","China","China","China",
>      "India","India","India","India","India","India","Indonesia","Indonesia","Indonesia","Indonesia","Indonesia","Indonesia",
>      "Malaysia","Malaysia","Malaysia","Malaysia","Malaysia","Malaysia","Myanmar","Myanmar","Myanmar","Myanmar","Myanmar",
>      "Myanmar","Philippines","Philippines","Philippines","Philippines","Philippines","Philippines","South Korea","South Korea",
>      "South Korea","South Korea","South Korea","South Korea","China","China","China","China","China","China",
>      "Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Taiwan","Taiwan","Taiwan","Taiwan",
>      "Taiwan","Taiwan","Thailand","Thailand","Thailand","Thailand","Thailand","Thailand","Vietnam","Vietnam","Vietnam","Vietnam",
>      "Vietnam","Vietnam"))
>      
>      Count <- c(0,0,3,5,1,5,0,0,0,0,0,1,0,0,0,0,0,3,0,0,2,1,0,6,0,0,0,1,0,0,0,1,0,0,0,0
>      ,0,0,20,0,1,0,0,0,0,0,0,2,0,0,6,3,3,10,1,1,18,11,8,11,0,1,2,2,1,14,0,0,0,1,0,0
>      ,0,0,4,3,9,16,0,0,3,0,0,1,0,0,1,0,0,0,0,0,33,18,8,16,0,0,0,0,0,2,0,1,14,6,8,2
>      ,0,0,0,0,1,1)
>      
>      Data <- data.frame(Count,Country)


I'm not seeing any spatial data being defined, so I'm puzzled by the
expectation that this is yet a markov random field problem. You appear
to be following the last part of the example code in
?smooth.construct.mrf.smooth.spec {mgcv} without constructing your data
set to match the structure of the `columb` example dataset.

str(columb)

#------------------

'data.frame': 49 obs. of  8 variables:
  $ area      : num  0.3094 0.2593 0.1925 0.0838 0.4889 ...
  $ home.value: num  80.5 44.6 26.4 33.2 23.2 ...
  $ income    : num  19.53 21.23 15.96 4.48 11.25 ...
  $ crime     : num  15.7 18.8 30.6 32.4 50.7 ...
  $ open.space: num  2.851 5.297 4.535 0.394 0.406 ...
  $ district  : Factor w/ 49 levels "0","1","2","3",..: 1 2 3 4 5 6 7 8
9 10 ...
  $ x         : num  8.83 8.33 9.01 8.46 9.01 ...
  $ y         : num  14.4 14 13.8 13.7 13.3 ...


You are also committing a common R-beginner error in accessing columns
of a data object directly in a formula while failing to use a data
argument for a regression call.

--

David.

>      
>      #create neighbour matrix
>      NB <- list()
>      NB$'East Timor' <- c(1,2,15)
>      NB$Australia <- c(1,2,15)
>      NB$'Sri Lanka' <-c(3,12,16)
>      NB$Bangladesh <-c(4,12,13,16)
>      NB$Philippines <- c(5,6,11,14,15,17)
>      NB$Taiwan <- c(5,6,11)
>      NB$Thailand <- c(7,8,10,12,13,14,15)
>      NB$Vietnam <- c(7,8,10,11,14,15)
>      NB$`South Korea` <- c(9,11)
>      NB$Cambodia <- c(7,8,10)
>      NB$China <- c(5,6,8,9,11,14)
>      NB$India <- c(3,4,7,12,13,15,16)
>      NB$Myanmar <- c(4,7,12,13,16)
>      NB$Malaysia <- c(5,7,8,11,14,15)
>      NB$Indonesia <- c(1,2,5,7,8,12,14,15)
>      NB$HighSeas2 <- c(3,4,12,13,16)
>      NB$HighSeas1 <- c(5,17)
>      
>      #check levels and names match
>      all.equal(sort(names(NB)), sort(levels(Data$Country)))
>      
>      #try fitting GAM
>      m1 <- gam(Data$Count ~ s(Data$Country, bs = 'mrf', xt = list(nb = NB)))
>      
>      
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: FW: problem with markov random field smooths in mgcv

chris wilcox
Hi David,

Thanks for the comments.  I am running the analysis on a mac OS 10.12.6, using R R 3.5.3 GUI 1.70 El Capitan build (7632), and mgcv 1.8-31.  

I am aware of the procedure of using data = "xx" in a call to gam.  I am having a strange issue locally, in that gam does not see to be able to find the response variable (Count) if I use the standard way of handing gam the data.  That is the reason I wrote the gam function as I did.  I wanted to ignore that issue for the moment.  I have not had it happen with mgcv before, despite having used it for 10 years or so.

In reference to your question about spatial data, see the section of the code that starts with #create neighbourhood matrix.  This should say list, not matrix.  But it is the indices of the levels in the Country variable that are neighbours.  This list defines the parameter nb in the call to the function that constructs the mrf.

Thanks for spending the time to look at the problem.  I did not use the columb structure, but if you read down in the help for the smooth constructor function you can see that Simon has an example using nb that is like the one I submitted.

Cheers,

Chris

On 19/3/20, 3:41 am, "David Winsemius" <[hidden email]> wrote:

   
    On 3/18/20 12:44 AM, Wilcox, Chris (O&A, Hobart) wrote:
    > Hi all,
    >      
    >      I am trying to fit a model with a markov random field smooth in mgcv.  I am having some trouble with getting it to run, and in particular I am getting the message
    >      
    >      Error in initial.sp(w * x, S, off) : S[[1]] matrix is not +ve definite.
    >      
    >      After reading everything I could find on mrf, it sounds like there was a bug that was brought up with Simon Wood in 2012, due to differences between windows and linux, with the linus machine stopping due to this error, while windows was not.  I have not been able to find much else on it.  Any suggestions would be much appreciated.
    >      
    >      There is reproducible code below.
    >      
    >      Thanks
    >      
    >      Chris
    >      
    >      
    >      library(mgcv)
    >      
    >      #create data
    >      Country <- as.factor(c("Australia","Australia","Australia","Australia","Australia","Australia","Bangladesh","Bangladesh","Bangladesh",
    >      "Bangladesh","Bangladesh","Bangladesh","Cambodia","Cambodia","Cambodia","Cambodia","Cambodia","Cambodia",
    >      "China","China","China","China","China","China","East Timor","East Timor","East Timor",
    >      "East Timor","East Timor","East Timor","HighSeas1","HighSeas1","HighSeas1","HighSeas1","HighSeas1","HighSeas1",
    >      "HighSeas2","HighSeas2","HighSeas2","HighSeas2","HighSeas2","HighSeas2","China","China","China","China","China","China",
    >      "India","India","India","India","India","India","Indonesia","Indonesia","Indonesia","Indonesia","Indonesia","Indonesia",
    >      "Malaysia","Malaysia","Malaysia","Malaysia","Malaysia","Malaysia","Myanmar","Myanmar","Myanmar","Myanmar","Myanmar",
    >      "Myanmar","Philippines","Philippines","Philippines","Philippines","Philippines","Philippines","South Korea","South Korea",
    >      "South Korea","South Korea","South Korea","South Korea","China","China","China","China","China","China",
    >      "Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Taiwan","Taiwan","Taiwan","Taiwan",
    >      "Taiwan","Taiwan","Thailand","Thailand","Thailand","Thailand","Thailand","Thailand","Vietnam","Vietnam","Vietnam","Vietnam",
    >      "Vietnam","Vietnam"))
    >      
    >      Count <- c(0,0,3,5,1,5,0,0,0,0,0,1,0,0,0,0,0,3,0,0,2,1,0,6,0,0,0,1,0,0,0,1,0,0,0,0
    >      ,0,0,20,0,1,0,0,0,0,0,0,2,0,0,6,3,3,10,1,1,18,11,8,11,0,1,2,2,1,14,0,0,0,1,0,0
    >      ,0,0,4,3,9,16,0,0,3,0,0,1,0,0,1,0,0,0,0,0,33,18,8,16,0,0,0,0,0,2,0,1,14,6,8,2
    >      ,0,0,0,0,1,1)
    >      
    >      Data <- data.frame(Count,Country)
   
   
    I'm not seeing any spatial data being defined, so I'm puzzled by the
    expectation that this is yet a markov random field problem. You appear
    to be following the last part of the example code in
    ?smooth.construct.mrf.smooth.spec {mgcv} without constructing your data
    set to match the structure of the `columb` example dataset.
   
    str(columb)
   
    #------------------
   
    'data.frame': 49 obs. of  8 variables:
      $ area      : num  0.3094 0.2593 0.1925 0.0838 0.4889 ...
      $ home.value: num  80.5 44.6 26.4 33.2 23.2 ...
      $ income    : num  19.53 21.23 15.96 4.48 11.25 ...
      $ crime     : num  15.7 18.8 30.6 32.4 50.7 ...
      $ open.space: num  2.851 5.297 4.535 0.394 0.406 ...
      $ district  : Factor w/ 49 levels "0","1","2","3",..: 1 2 3 4 5 6 7 8
    9 10 ...
      $ x         : num  8.83 8.33 9.01 8.46 9.01 ...
      $ y         : num  14.4 14 13.8 13.7 13.3 ...
   
   
    You are also committing a common R-beginner error in accessing columns
    of a data object directly in a formula while failing to use a data
    argument for a regression call.
   
    --
   
    David.
   
    >      
    >      #create neighbour matrix
    >      NB <- list()
    >      NB$'East Timor' <- c(1,2,15)
    >      NB$Australia <- c(1,2,15)
    >      NB$'Sri Lanka' <-c(3,12,16)
    >      NB$Bangladesh <-c(4,12,13,16)
    >      NB$Philippines <- c(5,6,11,14,15,17)
    >      NB$Taiwan <- c(5,6,11)
    >      NB$Thailand <- c(7,8,10,12,13,14,15)
    >      NB$Vietnam <- c(7,8,10,11,14,15)
    >      NB$`South Korea` <- c(9,11)
    >      NB$Cambodia <- c(7,8,10)
    >      NB$China <- c(5,6,8,9,11,14)
    >      NB$India <- c(3,4,7,12,13,15,16)
    >      NB$Myanmar <- c(4,7,12,13,16)
    >      NB$Malaysia <- c(5,7,8,11,14,15)
    >      NB$Indonesia <- c(1,2,5,7,8,12,14,15)
    >      NB$HighSeas2 <- c(3,4,12,13,16)
    >      NB$HighSeas1 <- c(5,17)
    >      
    >      #check levels and names match
    >      all.equal(sort(names(NB)), sort(levels(Data$Country)))
    >      
    >      #try fitting GAM
    >      m1 <- gam(Data$Count ~ s(Data$Country, bs = 'mrf', xt = list(nb = NB)))
    >      
    >      
    >
    > ______________________________________________
    > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.
   

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: FW: problem with markov random field smooths in mgcv

Simon Wood
In reply to this post by chris wilcox
Hi Chris,

It's kind of a documentation glitch, a node is not supposed to be listed
as its own neighbour (it causes the diagonal entries in the penalty
matrix to be over-written by the wrong value). i.e. the neighbour list
should be.

  NB <- list()
     NB$'East Timor' <- c(2,15)
     NB$Australia <- c(1,15)
     NB$'Sri Lanka' <-c(12,16)
     NB$Bangladesh <-c(12,13,16)
     NB$Philippines <- c(6,11,14,15,17)
     NB$Taiwan <- c(5,11)
     NB$Thailand <- c(8,10,12,13,14,15)
     NB$Vietnam <- c(7,10,11,14,15)
     NB$`South Korea` <- c(11)
     NB$Cambodia <- c(7,8)
     NB$China <- c(5,6,8,9,14)
     NB$India <- c(3,4,7,13,15,16)
     NB$Myanmar <- c(4,7,12,16)
     NB$Malaysia <- c(5,7,8,11,15)
     NB$Indonesia <- c(1,2,5,7,8,12,14)
     NB$HighSeas2 <- c(3,4,12,13)
     NB$HighSeas1 <- c(5)

I've fixed the help page and had the smooth constructor ignore
auto-neighbours for the next release.

best,
Simon

   On 18/03/2020 07:44, Wilcox, Chris (O&A, Hobart) wrote:

> Hi all,
>      
>      I am trying to fit a model with a markov random field smooth in mgcv.  I am having some trouble with getting it to run, and in particular I am getting the message
>      
>      Error in initial.sp(w * x, S, off) : S[[1]] matrix is not +ve definite.
>      
>      After reading everything I could find on mrf, it sounds like there was a bug that was brought up with Simon Wood in 2012, due to differences between windows and linux, with the linus machine stopping due to this error, while windows was not.  I have not been able to find much else on it.  Any suggestions would be much appreciated.
>      
>      There is reproducible code below.
>      
>      Thanks
>      
>      Chris
>      
>      
>      library(mgcv)
>      
>      #create data
>      Country <- as.factor(c("Australia","Australia","Australia","Australia","Australia","Australia","Bangladesh","Bangladesh","Bangladesh",
>      "Bangladesh","Bangladesh","Bangladesh","Cambodia","Cambodia","Cambodia","Cambodia","Cambodia","Cambodia",
>      "China","China","China","China","China","China","East Timor","East Timor","East Timor",
>      "East Timor","East Timor","East Timor","HighSeas1","HighSeas1","HighSeas1","HighSeas1","HighSeas1","HighSeas1",
>      "HighSeas2","HighSeas2","HighSeas2","HighSeas2","HighSeas2","HighSeas2","China","China","China","China","China","China",
>      "India","India","India","India","India","India","Indonesia","Indonesia","Indonesia","Indonesia","Indonesia","Indonesia",
>      "Malaysia","Malaysia","Malaysia","Malaysia","Malaysia","Malaysia","Myanmar","Myanmar","Myanmar","Myanmar","Myanmar",
>      "Myanmar","Philippines","Philippines","Philippines","Philippines","Philippines","Philippines","South Korea","South Korea",
>      "South Korea","South Korea","South Korea","South Korea","China","China","China","China","China","China",
>      "Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Sri Lanka","Taiwan","Taiwan","Taiwan","Taiwan",
>      "Taiwan","Taiwan","Thailand","Thailand","Thailand","Thailand","Thailand","Thailand","Vietnam","Vietnam","Vietnam","Vietnam",
>      "Vietnam","Vietnam"))
>      
>      Count <- c(0,0,3,5,1,5,0,0,0,0,0,1,0,0,0,0,0,3,0,0,2,1,0,6,0,0,0,1,0,0,0,1,0,0,0,0
>      ,0,0,20,0,1,0,0,0,0,0,0,2,0,0,6,3,3,10,1,1,18,11,8,11,0,1,2,2,1,14,0,0,0,1,0,0
>      ,0,0,4,3,9,16,0,0,3,0,0,1,0,0,1,0,0,0,0,0,33,18,8,16,0,0,0,0,0,2,0,1,14,6,8,2
>      ,0,0,0,0,1,1)
>      
>      Data <- data.frame(Count,Country)
>      
>      #create neighbour matrix
>      NB <- list()
>      NB$'East Timor' <- c(1,2,15)
>      NB$Australia <- c(1,2,15)
>      NB$'Sri Lanka' <-c(3,12,16)
>      NB$Bangladesh <-c(4,12,13,16)
>      NB$Philippines <- c(5,6,11,14,15,17)
>      NB$Taiwan <- c(5,6,11)
>      NB$Thailand <- c(7,8,10,12,13,14,15)
>      NB$Vietnam <- c(7,8,10,11,14,15)
>      NB$`South Korea` <- c(9,11)
>      NB$Cambodia <- c(7,8,10)
>      NB$China <- c(5,6,8,9,11,14)
>      NB$India <- c(3,4,7,12,13,15,16)
>      NB$Myanmar <- c(4,7,12,13,16)
>      NB$Malaysia <- c(5,7,8,11,14,15)
>      NB$Indonesia <- c(1,2,5,7,8,12,14,15)
>      NB$HighSeas2 <- c(3,4,12,13,16)
>      NB$HighSeas1 <- c(5,17)
>      
>      #check levels and names match
>      all.equal(sort(names(NB)), sort(levels(Data$Country)))
>      
>      #try fitting GAM
>      m1 <- gam(Data$Count ~ s(Data$Country, bs = 'mrf', xt = list(nb = NB)))
>      
>      
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Simon Wood, School of Mathematics, University of Bristol, BS8 1TW UK
https://people.maths.bris.ac.uk/~sw15190/

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.