Dear list users,

I am looking for an R package implementing a multinomial logistic

regression with fixed effects (Chamberlain 1980, Review of Economic

Studies 47: 225–238).

Over the years, a number of questions have been asked on R-help

and on Stack Exchange sites about how to use this model

in a fixed-effects framework.

In some cases, the suggestion was to use existing routines, mainly

nnet::multinom and mlogit::mlogit. However, this doesn’t look like a

viable solution, because these packages are not exactly designed for

this task. Perhaps unsurprisingly, I was not able to find any working

example that can serve my purpose.

Others have suggested using a Poisson transformation. This is

discussed, for example, in a working paper on arXiv:

https://arxiv.org/pdf/1707.08538.pdf. However, I found no useful

guides on implementing this approach in R.
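
For reference, the basic Poisson transformation (without fixed effects) can be sketched in base R as below. This is only a minimal illustration on hypothetical toy data: the names d, long, x_b and x_c are made up for the example, and nnet::multinom is used purely as a benchmark. Each choice situation is expanded to one row per alternative, and a Poisson GLM with one nuisance intercept per choice situation should reproduce the multinomial logit coefficients.

```r
library(nnet)  # recommended package shipped with R; used only as a benchmark

set.seed(1)
n_obs <- 200
alts  <- c("a", "b", "c")

# Hypothetical toy data: one row per choice situation, one scaled covariate
x   <- rnorm(n_obs)
eta <- cbind(0, 0.5 + x, -0.5 - 0.5 * x)   # linear indices, "a" is the base
p   <- exp(eta) / rowSums(exp(eta))
d   <- data.frame(obs = seq_len(n_obs), x = x)
d$choice <- factor(apply(p, 1, function(pr) sample(alts, 1, prob = pr)),
                   levels = alts)

# Benchmark: ordinary multinomial logit
fit_mn <- multinom(choice ~ x, data = d, trace = FALSE)

# Poisson trick: expand to one row per (choice situation, alternative),
# with y = 1 for the chosen alternative and 0 otherwise
long <- expand.grid(obs = d$obs, alt = alts, stringsAsFactors = FALSE)
long$y   <- as.integer(as.character(d$choice[long$obs]) == long$alt)
long$x_b <- d$x[long$obs] * (long$alt == "b")   # covariate x alternative "b"
long$x_c <- d$x[long$obs] * (long$alt == "c")   # covariate x alternative "c"
long$alt <- factor(long$alt, levels = alts)

# factor(obs) adds one nuisance intercept per choice situation
fit_pois <- glm(y ~ factor(obs) + alt + x_b + x_c,
                family = poisson, data = long)

# The alternative-specific coefficients should agree up to optimizer tolerance
cmp <- cbind(multinom = as.vector(coef(fit_mn)),
             poisson  = coef(fit_pois)[c("altb", "altc", "x_b", "x_c")])
print(round(cmp, 3))
```

In principle, individual effects could be added to this formulation by interacting an individual identifier with alt, but that amounts to unconditional estimation with many dummies, which in short panels may be affected by the incidental parameters problem. I have not checked how the working paper above handles this.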

Finally, many have been pointed to packages using Bayesian

estimation strategies.

I was wondering if anyone on this list can provide an example or any

resource that I can use to begin working on this model using R.

Let me stress that I am not interested in working with Bayesian methods.

My data looks like this:

set.seed(123)

# number of individuals

n <- 100

# possible choices

possible_choice <- letters[1:4]

# number of years

years <- 3

# individual characteristics (one value per individual-year)

x1 <- runif(n * years, 5.0, 70.5)

x2 <- sample(1:n^2, n * years, replace = FALSE)

# actual choice in each year

actual_choice_year_1 <- possible_choice[sample(1:4, n, replace = TRUE,

                                               prob = rep(1/4, 4))]

actual_choice_year_2 <- possible_choice[sample(1:4, n, replace = TRUE,

                                               prob = c(0.4, 0.3, 0.2, 0.1))]

actual_choice_year_3 <- possible_choice[sample(1:4, n, replace = TRUE,

                                               prob = c(0.2, 0.5, 0.2, 0.1))]

# create long dataset

df <- data.frame(choice = c(actual_choice_year_1,

                            actual_choice_year_2, actual_choice_year_3),

                 x1 = x1, x2 = x2,

                 individual_fixed_effect = as.character(rep(1:n, years)),

                 time_fixed_effect = as.character(rep(1:years, each = n)),

                 stringsAsFactors = FALSE)

Ideally, what I would like to estimate is a formula of the kind

formula("choice ~ x1 + x2 + individual_fixed_effect + time_fixed_effect")

To this end, I have tried to use the package mlogit. Consequently,

following the vignette of this package, I have rearranged my data as

library(mlogit)

# df has one row per choice situation ("wide" in mlogit's terminology)

data_mlogit <- mlogit.data(df, id.var = "individual_fixed_effect",

group.var = "time_fixed_effect",

choice = "choice",

shape = "wide")

This allows me to run a multinomial logit regression without fixed

effects by typing

# formula

formula_mlogit <- formula("choice ~ 1| x1 + x2")

# run multinomial regression

fit <- mlogit(formula_mlogit, data_mlogit)

summary(fit)

Apparently, in order to include fixed effects and use a panel

estimation strategy, one should set the argument panel equal to TRUE

in the function mlogit.

However, according to the help page of this function, this argument is

evaluated only if another argument, rpar, is not NULL.

The argument rpar sets the distributions of the random parameters

in the model specification. Unfortunately, my specification includes

no random parameters, hence I have nothing to pass to

rpar. As a result, mlogit does not seem to be the best choice in this context.

On Stack Exchange, a possible solution was proposed a few years ago:

https://stats.stackexchange.com/questions/51148/unable-to-provide-random-parameter-with-mlogit

However, I don't understand how to actually implement it.

Regards,

Valerio Leone Sciabolazza, Ph.D.

Department of Business and Economics

University of Naples, Parthenope.

[hidden email]
www.valerioleonesciabolazza.com

P.S.

Recently, Stata provided a package (femlogit) for the estimation of

this model with fixed effects.
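
For reference, the conditional-likelihood idea from Chamberlain (1980) can be sketched in base R as below. This is only a rough illustration on simulated data, not a tested implementation and not necessarily what femlogit does internally; perms, neg_cll and groups are hypothetical helper names. The sketch conditions on how many times each individual picks each alternative, so the individual-by-alternative effects cancel, and it enumerates the distinct permutations of each observed choice sequence, which is feasible only for short panels.

```r
set.seed(42)
N <- 500; n_t <- 3; J <- 3                 # individuals, periods, alternatives
id <- rep(seq_len(N), each = n_t)
x  <- rnorm(N * n_t)

# Hypothetical fixed effects, correlated with the covariate by construction
xbar  <- as.numeric(tapply(x, id, mean))
alpha <- cbind(0, xbar + rnorm(N), rnorm(N) - xbar)
beta  <- c(0, 1, -1)                       # true slopes; alternative 1 is the base

# Simulated choices: utility = alpha_ij + x_it * beta_j + Gumbel error
U <- alpha[id, ] + outer(x, beta) -
  log(-log(matrix(runif(N * n_t * J), ncol = J)))
y <- apply(U, 1, which.max)
X <- matrix(x, ncol = 1)                   # regressors, no intercept

# Distinct permutations of a short vector, one per row
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  out <- NULL
  for (k in unique(v)) {
    rest <- perms(v[-match(k, v)])
    out  <- rbind(out, cbind(k, rest))
  }
  unname(out)
}

# Precompute, per individual, the rows, the observed sequence and its permutations
groups <- lapply(split(seq_along(y), id), function(rows)
  list(rows = rows, y = y[rows], P = perms(y[rows])))
# Individuals who never switch alternative carry no information and drop out
groups <- Filter(function(g) nrow(g$P) > 1, groups)

# Negative conditional log-likelihood; theta stacks slopes of alternatives 2..J
neg_cll <- function(theta, X, groups) {
  B  <- cbind(0, matrix(theta, nrow = ncol(X)))   # K x J, base column is zero
  V  <- X %*% B                                   # index per row and alternative
  ll <- 0
  for (g in groups) {
    Vi  <- V[g$rows, , drop = FALSE]
    s   <- apply(g$P, 1, function(d) sum(Vi[cbind(seq_along(d), d)]))
    obs <- sum(Vi[cbind(seq_along(g$y), g$y)])
    m   <- max(s)                                 # log-sum-exp for stability
    ll  <- ll + obs - (m + log(sum(exp(s - m))))
  }
  -ll
}

fit <- optim(c(0, 0), neg_cll, X = X, groups = groups, method = "BFGS")
print(fit$par)   # estimated slopes for alternatives 2 and 3
```

In this setup, time effects could presumably be handled by adding time dummies as extra columns of X, since they vary within individual. The number of permutations grows quickly with the number of periods, so this brute-force enumeration is not meant for long panels.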
