Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I've been struggling with converting scaled and centered model coefficients from a glmer model back to uncentered and unscaled values.

I analysed a dataset using GLMM in the lme4 (v1.1.7) package. It involves the calculation of maximum detection range of acoustic receivers and effect of environmental variables.

Sample data:

dd <-   structure(list(SUR.ID = c(10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 10186L, 
10186L, 10186L, 10186L, 10249L, 10249L, 10249L, 10249L, 10249L, 
10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 
10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 
10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 
10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 10249L, 
10249L, 10249L, 10249L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 10250L, 
10250L, 10250L, 10250L), Valid.detections = c(1L, 4L, 0L, 1L, 
6L, 7L, 0L, 1L, 0L, 0L, 6L, 5L, 3L, 5L, 0L, 0L, 1L, 0L, 0L, 0L, 
2L, 3L, 0L, 1L, 5L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 5L, 3L, 1L, 1L, 
0L, 0L, 5L, 8L, 0L, 1L, 0L, 0L, 3L, 7L, 1L, 2L, 7L, 0L, 7L, 6L, 
0L, 3L, 0L, 1L, 0L, 1L, 2L, 5L, 0L, 3L, 0L, 2L, 1L, 5L, 3L, 0L, 
0L, 2L, 0L, 0L, 0L, 0L, 0L, 3L, 4L, 0L, 2L, 2L, 0L, 3L, 0L, 0L, 
9L, 8L, 0L, 2L, 9L, 0L, 7L, 4L, 0L, 5L, 0L, 2L, 0L, 1L, 2L, 4L, 
3L, 2L, 1L, 1L, 3L, 4L, 1L, 2L, 1L, 3L, 0L, 0L, 0L, 6L, 0L, 5L, 
6L, 1L, 3L, 1L, 1L, 0L, 2L, 1L, 6L, 5L, 2L, 1L, 2L, 0L, 1L, 7L, 
5L, 4L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 4L, 2L, 6L, 0L, 0L, 
0L, 1L, 0L, 0L, 3L, 9L, 0L, 7L, 0L, 2L, 7L, 3L, 0L, 5L, 0L, 1L, 
1L, 9L, 2L, 9L, 1L, 0L, 6L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 3L, 13L, 
0L, 4L, 1L, 1L, 1L, 2L, 1L, 6L, 0L, 2L, 0L, 0L, 0L, 1L, 1L, 11L, 
5L, 0L, 6L, 5L), distance = c(200L, 200L, 200L, 200L, 100L, 100L, 
300L, 300L, 400L, 400L, 50L, 50L, 50L, 50L, 300L, 300L, 200L, 
200L, 400L, 400L, 200L, 200L, 100L, 100L, 100L, 100L, 300L, 300L, 
300L, 300L, 400L, 400L, 50L, 50L, 50L, 50L, 400L, 400L, 100L, 
100L, 200L, 200L, 200L, 200L, 100L, 100L, 100L, 100L, 50L, 300L, 
50L, 300L, 300L, 300L, 400L, 400L, 400L, 400L, 50L, 50L, 200L, 
200L, 200L, 100L, 200L, 100L, 100L, 100L, 300L, 300L, 400L, 400L, 
400L, 50L, 400L, 50L, 50L, 300L, 50L, 300L, 200L, 200L, 200L, 
200L, 100L, 100L, 100L, 100L, 50L, 300L, 50L, 300L, 300L, 300L, 
400L, 400L, 400L, 400L, 50L, 50L, 200L, 200L, 200L, 100L, 200L, 
100L, 100L, 100L, 300L, 300L, 400L, 400L, 400L, 50L, 400L, 50L, 
50L, 300L, 50L, 300L, 200L, 200L, 200L, 200L, 100L, 100L, 300L, 
300L, 400L, 400L, 50L, 50L, 50L, 50L, 300L, 300L, 200L, 200L, 
400L, 400L, 200L, 200L, 100L, 100L, 100L, 100L, 300L, 300L, 300L, 
300L, 400L, 400L, 50L, 50L, 50L, 50L, 400L, 400L, 100L, 100L, 
200L, 200L, 200L, 200L, 100L, 100L, 100L, 100L, 50L, 300L, 50L, 
300L, 300L, 300L, 400L, 400L, 400L, 400L, 50L, 50L, 200L, 200L, 
200L, 100L, 200L, 100L, 100L, 100L, 300L, 300L, 400L, 400L, 400L, 
50L, 400L, 50L, 50L, 300L, 50L, 300L), wind.speed = c(8.9939016, 
8.9939016, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 
8.9939016, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 
8.9939016, 8.9939016, 8.9939016, 10.8187512, 10.8187512, 8.9939016, 
8.9939016, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 
10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 
10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 
10.8187512, 8.9939016, 8.9939016, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 
8.9939016, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 
8.9939016, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 8.9939016, 
10.8187512, 10.8187512, 8.9939016, 8.9939016, 10.8187512, 10.8187512, 
10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 
10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 10.8187512, 
10.8187512, 10.8187512, 10.8187512, 10.8187512, 8.9939016, 8.9939016, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
2.389683519, 2.389683519, 2.389683519, 2.389683519, 2.389683519, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038, 
4.779367038, 4.779367038, 4.779367038, 4.779367038, 4.779367038
), receiver.depth = c(0.65, 0.65, 0.69, 0.69, 0.685, 0.685, 0.645, 
0.645, 0.645, 0.645, 0.67, 0.67, 0.665, 0.665, 0.705, 0.705, 
1.12, 1.12, 0.73, 0.73, 1.155, 1.155, 1.13, 1.13, 1.155, 1.155, 
1.105, 1.105, 1.155, 1.155, 1.095, 1.095, 1.145, 1.145, 1.14, 
1.14, 1.15, 1.15, 0.65, 0.65, 0.41, 0.41, 0.455, 0.455, 0.405, 
0.405, 0.49, 0.49, 0.415, 0.42, 0.415, 0.42, 0.45, 0.45, 0.43, 
0.43, 0.45, 0.45, 0.51, 0.51, 1.01, 1.01, 1.095, 1.045, 1.095, 
1.045, 1.09, 1.09, 1, 1, 0.975, 0.975, 1.08, 1.055, 1.08, 1.055, 
1.085, 1.095, 1.085, 1.095, 0.41, 0.41, 0.455, 0.455, 0.405, 
0.405, 0.49, 0.49, 0.415, 0.42, 0.415, 0.42, 0.45, 0.45, 0.43, 
0.43, 0.45, 0.45, 0.51, 0.51, 1.01, 1.01, 1.095, 1.045, 1.095, 
1.045, 1.09, 1.09, 1, 1, 0.975, 0.975, 1.08, 1.055, 1.08, 1.055, 
1.085, 1.095, 1.085, 1.095, 0.65, 0.65, 0.69, 0.69, 0.685, 0.685, 
0.645, 0.645, 0.645, 0.645, 0.67, 0.67, 0.665, 0.665, 0.705, 
0.705, 1.12, 1.12, 0.73, 0.73, 1.155, 1.155, 1.13, 1.13, 1.155, 
1.155, 1.105, 1.105, 1.155, 1.155, 1.095, 1.095, 1.145, 1.145, 
1.14, 1.14, 1.15, 1.15, 0.65, 0.65, 0.41, 0.41, 0.455, 0.455, 
0.405, 0.405, 0.49, 0.49, 0.415, 0.42, 0.415, 0.42, 0.45, 0.45, 
0.43, 0.43, 0.45, 0.45, 0.51, 0.51, 1.01, 1.01, 1.095, 1.045, 
1.095, 1.045, 1.09, 1.09, 1, 1, 0.975, 0.975, 1.08, 1.055, 1.08, 
1.055, 1.085, 1.095, 1.085, 1.095), water.temperature = c(20.33, 
20.33, 20.9, 20.9, 20.72, 20.72, 20.365, 20.365, 20.505, 20.505, 
20.445, 20.445, 20.62, 20.62, 20.88, 20.88, 22.775, 22.775, 20.92, 
20.92, 22.86, 22.86, 22.755, 22.755, 22.835, 22.835, 22.765, 
22.765, 22.86, 22.86, 22.78, 22.78, 22.835, 22.835, 22.78, 22.78, 
22.835, 22.835, 20.32, 20.32, 27.925, 27.925, 27.62, 27.62, 27.82, 
27.82, 27.58, 27.58, 27.67, 27.98, 27.67, 27.98, 27.63, 27.63, 
27.64, 27.64, 27.96, 27.96, 27.52, 27.52, 26.21, 26.21, 25.725, 
26.14, 25.725, 26.14, 25.605, 25.605, 26.205, 26.205, 26.255, 
26.255, 25.92, 26.07, 25.92, 26.07, 25.525, 25.795, 25.525, 25.795, 
27.925, 27.925, 27.62, 27.62, 27.82, 27.82, 27.58, 27.58, 27.67, 
27.98, 27.67, 27.98, 27.63, 27.63, 27.64, 27.64, 27.96, 27.96, 
27.52, 27.52, 26.21, 26.21, 25.725, 26.14, 25.725, 26.14, 25.605, 
25.605, 26.205, 26.205, 26.255, 26.255, 25.92, 26.07, 25.92, 
26.07, 25.525, 25.795, 25.525, 25.795, 20.33, 20.33, 20.9, 20.9, 
20.72, 20.72, 20.365, 20.365, 20.505, 20.505, 20.445, 20.445, 
20.62, 20.62, 20.88, 20.88, 22.775, 22.775, 20.92, 20.92, 22.86, 
22.86, 22.755, 22.755, 22.835, 22.835, 22.765, 22.765, 22.86, 
22.86, 22.78, 22.78, 22.835, 22.835, 22.78, 22.78, 22.835, 22.835, 
20.32, 20.32, 27.925, 27.925, 27.62, 27.62, 27.82, 27.82, 27.58, 
27.58, 27.67, 27.98, 27.67, 27.98, 27.63, 27.63, 27.64, 27.64, 
27.96, 27.96, 27.52, 27.52, 26.21, 26.21, 25.725, 26.14, 25.725, 
26.14, 25.605, 25.605, 26.205, 26.205, 26.255, 26.255, 25.92, 
26.07, 25.92, 26.07, 25.525, 25.795, 25.525, 25.795), Habitat = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Drug Channel", class = "factor"), 
    Distance = c(-0.078078746, -0.078078746, -0.078078746, -0.078078746, 
    -0.858866211, -0.858866211, 0.702708718, 0.702708718, 1.483496183, 
    1.483496183, -1.249259944, -1.249259944, -1.249259944, -1.249259944, 
    0.702708718, 0.702708718, -0.078078746, -0.078078746, 1.483496183, 
    1.483496183, -0.078078746, -0.078078746, -0.858866211, -0.858866211, 
    -0.858866211, -0.858866211, 0.702708718, 0.702708718, 0.702708718, 
    0.702708718, 1.483496183, 1.483496183, -1.249259944, -1.249259944, 
    -1.249259944, -1.249259944, 1.483496183, 1.483496183, -0.858866211, 
    -0.858866211, -0.078078746, -0.078078746, -0.078078746, -0.078078746, 
    -0.858866211, -0.858866211, -0.858866211, -0.858866211, -1.249259944, 
    0.702708718, -1.249259944, 0.702708718, 0.702708718, 0.702708718, 
    1.483496183, 1.483496183, 1.483496183, 1.483496183, -1.249259944, 
    -1.249259944, -0.078078746, -0.078078746, -0.078078746, -0.858866211, 
    -0.078078746, -0.858866211, -0.858866211, -0.858866211, 0.702708718, 
    0.702708718, 1.483496183, 1.483496183, 1.483496183, -1.249259944, 
    1.483496183, -1.249259944, -1.249259944, 0.702708718, -1.249259944, 
    0.702708718, -0.078078746, -0.078078746, -0.078078746, -0.078078746, 
    -0.858866211, -0.858866211, -0.858866211, -0.858866211, -1.249259944, 
    0.702708718, -1.249259944, 0.702708718, 0.702708718, 0.702708718, 
    1.483496183, 1.483496183, 1.483496183, 1.483496183, -1.249259944, 
    -1.249259944, -0.078078746, -0.0

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
279 views
Welcome To Ask or Share your Answers For Others

1 Answer

Read in data:

source("SO_unscale.txt")

Separate unscaled and scaled variables (Transmitter.depth doesn't appear to have a scaled variant)

unsc.vars <- subset(dd,select=c(Transmitter.depth,
                       receiver.depth,water.temperature,
                       wind.speed,distance))
sc.vars <- subset(dd,select=c(Transmitter.depth,
                     Receiver.depth,Water.temperature,
                     Wind.speed,Distance))

I noticed that the means and standard deviations of the scaled variables were not exactly 0/1, perhaps because what's here is a subset of the data. In any case, we will need the means and standard deviations of the original data in order to unscale.

colMeans(sc.vars)
apply(sc.vars,2,sd)
cm <- colMeans(unsc.vars)
csd <- apply(unsc.vars,2,sd)

It is possible to 'unscale' even if the new variables are not exactly centered/scaled (one would just need to enter the actual amount of the shift/scaling done), but it's marginally more complicated, so I'm just going to go ahead and fit with precisely centered/scaled variables.

## changed data name to dd
library(lme4)
cs. <- function(x) scale(x,center=TRUE,scale=TRUE)
m1 <- glmer(Valid.detections ~ Transmitter.depth +
            receiver.depth + water.temperature + 
            wind.speed + distance + (distance | SUR.ID),
            data=dd, family = poisson,
            control=glmerControl(optimizer=c("bobyqa","Nelder_Mead")))
## FAILS with bobyqa alone
m1.sc <- glmer(Valid.detections ~ cs.(Transmitter.depth) +
               cs.(receiver.depth) + cs.(water.temperature) + 
               cs.(wind.speed) + cs.(distance) + (cs.(distance) | SUR.ID),
               data=dd, family = poisson,
               control=glmerControl(optimizer=c("bobyqa","Nelder_Mead")))

An important point is that in this case the very different scaling doesn't seem to do any harm; the scaled and unscaled model get essentially the same goodness of fit (if it were important, we would expect the scaled fit to do better)

logLik(m1)-logLik(m1.sc)  ## 1e-7

Here is the rescaling function given in a previous answer:

rescale.coefs <- function(beta,mu,sigma) {
    beta2 <- beta ## inherit names etc.
    beta2[-1] <- sigma[1]*beta[-1]/sigma[-1]
    beta2[1]  <- sigma[1]*beta[1]+mu[1]-sum(beta2[-1]*mu[-1])
    beta2
}     

The parameters do indeed match very closely. (The shifting/scaling vectors include possible scaling/shifting of the response variable, so we start with 0/1 since the response is not scaled [it would rarely make sense to scale a response variable for a GLMM, but this function can be useful for LMMs too].)

(cc <- rescale.coefs(fixef(m1.sc),mu=c(0,cm),sigma=c(1,csd)))
##            (Intercept) cs.(Transmitter.depth)    cs.(receiver.depth) 
##            3.865879406            0.011158402           -0.554392645 
## cs.(water.temperature)        cs.(wind.speed)          cs.(distance) 
##           -0.050833325           -0.042188495           -0.007231021 

fixef(m1)
##  (Intercept) Transmitter.depth    receiver.depth water.temperature 
##  3.865816422       0.011180213      -0.554498582      -0.050830611 
##   wind.speed          distance 
## -0.042179333      -0.007231004 

Since they're the same (since the unscaled model does fit OK), we could use either set for this calculation.

ddist <- 1:1000
vals <- cbind(`(Intercept)`=1,Transmitter.depth=0.6067926,
          Receiver.depth=-0.1610828,Water.temperature=-0.1128282,
          Wind.speed=-0.2959290,distance=ddist)
pred.obs <- exp(cc %*% t(vals))
max(ddist[pred.obs>1])

Now suppose you want to do similar scaling/unscaling for a model with interactions or other complexities (i.e. the predictor variables, the columns of the fixed-effect model matrix, are not the same as the input variables, which are the variables that appear in the formula)

m2 <- update(m1,. ~ . + wind.speed:distance)
m2.sc <- update(m1.sc,. ~ . + I(cs.(wind.speed*distance)))
logLik(m2)-logLik(m2.sc)

Calculate mean/sd of model matrix, dropping the first (intercept) value:

X <- getME(m2,"X")                                        
cm2 <- colMeans(X)[-1]
csd2 <- apply(X,2,sd)[-1]                                            
(cc2 <- rescale.coefs(fixef(m2.sc),mu=c(0,cm2),sigma=c(1,csd2)))
all.equal(unname(cc2),unname(fixef(m2)),tol=1e-3)  ## TRUE

You don't actually have to fit the full unscaled model just to get the scaling parameters: you could use model.matrix([formula],data) to derive the model matrix. That is, if you haven't already fitted m2 and you want to get X to get the column means and standard deviations, i.e.

X <- model.matrix(Valid.detections ~ Transmitter.depth + receiver.depth +
                      water.temperature + 
                      wind.speed + distance + 
                      wind.speed:distance,
                  data=dd)

If you have a LMM/have scaled the response variable, you should also multiply all of the standard deviations (including the residual error, sigma(fitted_model)) by the original SD of the response variable.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...