Isolation forest using "solitude" package: help to predict

Johan Lassen
Dear community,

Can someone help clarify how to predict anomaly scores on new data sets
using the "solitude" package? A simple model can be trained using:

library(solitude)
# Training the model:
iris_train <- iris[1:100, ]
model <- isolation_forest(iris_train[, 1:4], seed = 100, num.trees = 100,
                          importance = "none")

# The anomaly scores of a new test data set can be calculated by:
iris_test <- iris[100:150, ]
predicted_anomalies <- predict(model, iris_test[, 1:4], type = "anomaly_score")

# The challenge is how to predict the anomaly scores for a data set with
# fewer observations than the training data set.
# Example: scoring a subset of just 11 observations instead of the 51
# observations above results in smaller anomaly scores:

iris_test <- iris[100:110, ]
predicted_anomalies <- predict(model, iris_test[, 1:4], type = "anomaly_score")

Does anyone know how to predict anomaly scores that are normalised with
respect to sample size using the solitude package for R?
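
To make clear what I mean by "normalised": as far as I understand the
original isolation forest formulation (Liu et al.), the score is
2^(-E[h(x)] / c(n)), where E[h(x)] is the average path depth of an
observation and c(n) depends on the sample size used to grow each tree, not
on the number of rows being scored. Below is a minimal base-R sketch of that
formula. The names c_factor, anomaly_score and avg_depth are my own and are
not part of solitude, the depths are made-up numbers, and I am not claiming
this is how the package computes its scores.

```r
# Normalising constant c(n): average path length of an unsuccessful search
# in a binary search tree built on n points.
c_factor <- function(n) {
  if (n <= 1) return(0)
  if (n == 2) return(1)
  harmonic <- log(n - 1) + 0.5772156649  # approximation of H(n - 1)
  2 * harmonic - 2 * (n - 1) / n
}

# avg_depth:   average path depth per observation (hypothetical input)
# sample_size: number of rows used to grow each tree, NOT the test set size
anomaly_score <- function(avg_depth, sample_size) {
  2^(-avg_depth / c_factor(sample_size))
}

avg_depth <- c(3.2, 6.5, 8.1)  # made-up depths, for illustration only
scores <- anomaly_score(avg_depth, sample_size = 100)
print(scores)
```

With this definition the score of an observation depends only on its own
depth and the training sample size, so scoring 11 rows or 51 rows gives the
same value for the same row.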

Thanks in advance!
Johan


--
Johan Lassen

"In the cities people live in time -
in the mountains people live in space" (Buddhist monk).


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.