Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


I am learning about the "kohonen" package in R for the purpose of making Self Organizing Maps (SOM, also called Kohonen Networks - a type of Machine Learning algorithm). I am following this R language tutorial over here: https://www.rpubs.com/loveb/som

I tried to create my own data (this time with both "factor" and "numeric" variables) and run the SOM algorithm (this time using the "supersom()" function instead):

#load libraries and adjust colors

library(kohonen) #fitting SOMs
library(ggplot2) #plots
library(RColorBrewer) #colors, using predefined palettes

 
contrast <- c("#FA4925", "#22693E", "#D4D40F", "#2C4382", "#F0F0F0", "#3D3D3D") #my own, contrasting pairs
cols <- brewer.pal(10, "Paired")




#create and format data

a <- rnorm(1000, 10, 10)
b <- rnorm(1000, 10, 5)
c <- rnorm(1000, 5, 5)
d <- rnorm(1000, 5, 10)
#draw 1000 values so the factor columns match the numeric ones (100 values would be silently recycled)
e <- sample( LETTERS[1:4], 1000 , replace=TRUE, prob=c(0.25, 0.25, 0.25, 0.25) )
f <- sample( LETTERS[1:5], 1000 , replace=TRUE, prob=c(0.2, 0.2, 0.2, 0.2, 0.2) )
g <- sample( LETTERS[1:2], 1000 , replace=TRUE, prob=c(0.5, 0.5) )

data = data.frame(a,b,c,d,e,f,g)
data$e = as.factor(data$e)
data$f = as.factor(data$f)
data$g = as.factor(data$g)


cols <- 1:4
data[cols] <- scale(data[cols])

#som model
som <- supersom(data= as.list(data), grid = somgrid(10,10, "hexagonal"), 
                dist.fct = "euclidean", keep.data = TRUE)

From here, I was able to successfully make some of the basic plots:

#plots

#pretty gradient colors

colour1 <- tricolor(som$grid)

colour4 <- tricolor(som$grid, phi = c(pi/8, 6, -pi/6), offset = 0.1)
    
plot(som, type="changes")
plot(som, type="count")
plot(som, type="quality", shape = "straight")
plot(som, type="dist.neighbours", palette.name=grey.colors, shape = "straight")

However, the problem arises when I try to make individual plots for each variable:

#error
var <- 1 #define the variable to plot
plot(som, type = "property", property = getCodes(som)[,var], main=colnames(getCodes(som))[var], palette.name=terrain.colors)


var <- 6 #define the variable to plot
plot(som, type = "property", property = getCodes(som)[,var], main=colnames(getCodes(som))[var], palette.name=terrain.colors)

This produces an error: "Error: Incorrect Number of Dimensions"

A similar error (NAs by coercion) is produced when attempting to cluster the SOM Network:

#cluster (error)

set.seed(33) #for reproducibility
fit_kmeans <- kmeans(data, 3) #3 clusters are used, as indicated by the wss development.

cl_assignmentk <- fit_kmeans$cluster[data$unit.classif]
par(mfrow=c(1,1))
plot(som, type="mapping", bg = rgb(colour4), shape = "straight", border = "grey",col=contrast)
add.cluster.boundaries(som, fit_kmeans$cluster, lwd = 3, lty = 2, col=contrast[4])

Can someone please tell me what I am doing wrong? Thanks

Sources: https://www.rdocumentation.org/packages/kohonen/versions/2.0.5/topics/supersom



1 Answer

getCodes() returns a list here, so you have to index it like one.

Because supersom() was given seven layers, getCodes(som) is a list of 7 items named a to g. Select a single item with $ or [[ ]] rather than with matrix-style [, var] indexing.

e.g.

plot(som, type = "property", property = getCodes(som)[[1]], main=names(getCodes(som))[1], palette.name=terrain.colors)

or

plot(som, type = "property", property = getCodes(som)$a, main="a", palette.name=terrain.colors)

or

plot(som, type = "property", property = getCodes(som)[["a"]], main="a", palette.name=terrain.colors)

If you want to set the variable before calling plot(), you can do it like this:

var <- 1    
plot(som, type = "property", property = getCodes(som)[[var]], main=names(getCodes(som))[var], palette.name=terrain.colors)
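
To see why the [, var] form from the question fails while [[var]] works, here is a small base-R sketch. It uses a hand-made list, fake_codes, as a stand-in for what getCodes(som) returns; the name and the numbers are made up for illustration. It also assumes that a factor layer is stored as a matrix with one column per level, which you should confirm on your own model with dim(getCodes(som)[[6]]).

```r
# Stand-in for the result of getCodes(): a named list with one matrix per layer.
# The values are made up; only the structure matters here.
fake_codes <- list(
  a = matrix(c(0.1, -0.3, 0.8), ncol = 1, dimnames = list(NULL, "a")),
  g = matrix(c(0.9, 0.2, 0.5, 0.1, 0.8, 0.5), ncol = 2,
             dimnames = list(NULL, c("A", "B")))
)

# Matrix-style indexing on a list fails, as in the question
res <- try(fake_codes[, 1], silent = TRUE)
failed <- inherits(res, "try-error")

# List-style indexing returns the layer's matrix
layer_a <- fake_codes[[1]]
layer_name <- names(fake_codes)[1]

# Assumption: a factor layer has one column per level,
# so pick a single column to get one value per map unit
level_A_of_g <- fake_codes[["g"]][, "A"]
```

If that assumption holds for your model, plotting a factor layer such as f probably means passing a single column, e.g. getCodes(som)[["f"]][, "A"], since property expects one value per map unit.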

Regarding kmeans()

kmeans() needs a numeric matrix, or an object that can be coerced into one. Your data frame contains factors (categorical data), which cannot be coerced into numeric values, so either drop the factors or use a different clustering method.

Dropping the factors:

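#cluster (numeric layers only)
set.seed(33) #for reproducibility

#getCodes(som) is a list with one codebook matrix per layer;
#bind the numeric layers a-d into one matrix with one row per map unit
codes <- do.call(cbind, getCodes(som)[1:4])
fit_kmeans <- kmeans(codes, 3)
#3 clusters are used, as indicated by the wss development.

#cluster of the map unit each observation was assigned to (note som$unit.classif, not data$unit.classif)
cl_assignmentk <- fit_kmeans$cluster[som$unit.classif]
par(mfrow=c(1,1))
plot(som, type="mapping", bg = rgb(colour4), shape = "straight", border = "grey",col=contrast)
#add.cluster.boundaries() expects one cluster label per map unit
add.cluster.boundaries(som, fit_kmeans$cluster, lwd = 3, lty = 2, col=contrast[4])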
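
The coercion problem described above can be reproduced on a small made-up data frame, independent of the SOM. This is only a sketch: toy, its column names and its size are invented for illustration, and picking columns with is.numeric is one possible way to drop the factors, not the only one.

```r
set.seed(1)

# Small made-up data frame with two numeric columns and one factor
toy <- data.frame(
  x = rnorm(20),
  y = rnorm(20),
  grp = factor(sample(c("A", "B"), 20, replace = TRUE))
)

# as.matrix() on mixed columns gives a character matrix,
# which is why kmeans() cannot work with it
is_char <- is.character(as.matrix(toy))

# Keep only the numeric columns before clustering
num_cols <- sapply(toy, is.numeric)
toy_num <- as.matrix(toy[, num_cols])

fit <- kmeans(toy_num, centers = 2)
table(fit$cluster)
```

Selecting columns by type rather than by position (data[1:4]) keeps working if the column order changes.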

Edit: alternatively, you can select the layer directly in getCodes() with the idx argument:

plot(som, type = "property", property = getCodes(som, idx = 1), main="a", palette.name=terrain.colors)
