Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

How do I find the cosine similarity between vectors?

I need to find the similarity to measure the relatedness between two lines of text.

For example, I have two sentences like:

system for user interface

user interface machine

…?and their respective vectors after tF-idf, followed by normalisation using LSI, for example [1,0.5] and [0.5,1].

How do I measure the smiliarity between these vectors?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
608 views
Welcome To Ask or Share your Answers For Others

1 Answer

If you want to avoid relying on third-party libraries for such a simple task, here is a plain Java implementation:

public static double cosineSimilarity(double[] vectorA, double[] vectorB) {
    double dotProduct = 0.0;
    double normA = 0.0;
    double normB = 0.0;
    for (int i = 0; i < vectorA.length; i++) {
        dotProduct += vectorA[i] * vectorB[i];
        normA += Math.pow(vectorA[i], 2);
        normB += Math.pow(vectorB[i], 2);
    }   
    return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}

Note that the function assumes that the two vectors have the same length. You may want to explictly check it for safety.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...