I have written a program to identify the poet who wrote a given poem. More specifically, I have a training set of 30,000 lines of poetry from 3 poets, with 10,000 lines (verses) per poet. I trained the model using a backoff model:
That is, for each poet, I first count how many times each word occurs in that poet's verses in the training set, and then I compute the frequency of each pair of consecutive words: given that word a has appeared in the text, what is the probability that word b comes next? This is known as a bigram. Finally, I combine these probabilities into P-hat under each poet's model and assign the test verse to the poet with the highest probability.

The problem is that when I started tuning the hyperparameters, I found that the accuracy varies when the lambdas are held fixed and only epsilon is changed. But it should not: since the lambdas are unchanged, changing epsilon only shifts all the P-hats by the same amount, so it should not have any impact on accuracy. I assume my implementation is correct, because I have already reached 85% accuracy, which is good for this kind of model.

These are the results of comparing three models, keeping the lambdas constant but changing epsilon:
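To make the setup concrete, here is a minimal sketch of the kind of model described above. The exact formula is an assumption: it uses a common linear interpolation, P-hat(b | a) = lambda3 * P(b | a) + lambda2 * P(b) + lambda1 * epsilon, and scores a verse by summing the logs of its bigram terms. The function names (`train`, `p_hat`, `log_score`, `classify`) and the toy verses are hypothetical and only for illustration; the real code may differ.

```python
import math
from collections import Counter


def train(lines):
    """Count unigrams and bigrams over one poet's verses."""
    unigrams, bigrams = Counter(), Counter()
    for line in lines:
        words = line.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    # Keep the total word count so P(b) does not have to be re-summed each call.
    return unigrams, bigrams, sum(unigrams.values())


def p_hat(a, b, model, lambdas, epsilon):
    """Assumed interpolation: l3 * P(b|a) + l2 * P(b) + l1 * epsilon."""
    unigrams, bigrams, total = model
    l1, l2, l3 = lambdas
    p_uni = unigrams[b] / total if total else 0.0
    # P(b|a) estimated as count(a, b) / count(a); zero if a was never seen.
    p_bi = bigrams[(a, b)] / unigrams[a] if unigrams[a] else 0.0
    return l3 * p_bi + l2 * p_uni + l1 * epsilon


def log_score(verse, model, lambdas, epsilon):
    """Sum of log P-hat over the verse's bigrams (needs l1 * epsilon > 0)."""
    words = verse.split()
    return sum(
        math.log(p_hat(a, b, model, lambdas, epsilon))
        for a, b in zip(words, words[1:])
    )


def classify(verse, models, lambdas, epsilon):
    """Return the poet whose model gives the verse the highest log score."""
    return max(
        models,
        key=lambda poet: log_score(verse, models[poet], lambdas, epsilon),
    )


# Toy data, purely illustrative.
models = {
    "A": train(["the rose is red", "the rose is sweet"]),
    "B": train(["the sea is deep", "the sea is cold"]),
}
for eps in (1e-1, 1e-3, 1e-5):
    print(eps, classify("the rose is red", models, (0.1, 0.3, 0.6), eps))
```

In this assumed form, epsilon sits inside every per-bigram term, so once the terms are multiplied (or their logs summed) over a verse it is not a single constant added to each poet's score. Whether that accounts for the accuracy change seen here depends on how the actual implementation combines the terms.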
Question from: https://stackoverflow.com/questions/66067970/accuracy-decreases-when-i-increase-epsilon-in-backoff-nlp-model