Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I was playing with Mahout and found that the FileDataModel accepts data in the format

     userId,itemId,pref(long,long,Double).

I have some data which is of the format

     String,long,double 

What is the best/easiest method to work with this dataset on Mahout?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
263 views
Welcome To Ask or Share your Answers For Others

1 Answer

One way to do this is by creating an extension of FileDataModel. You'll need to override the readUserIDFromString(String value) method to use some kind of resolver do the conversion. You can use one of the implementations of IDMigrator, as Sean suggests.

For example, assuming you have an initialized MemoryIDMigrator, you could do this:

@Override
protected long readUserIDFromString(String stringID) {
    long result = memoryIDMigrator.toLongID(stringID); 
    memoryIDMigrator.storeMapping(result, stringID);
    return result;
}

This way you could use memoryIDMigrator to do the reverse mapping, too. If you don't need that, you can just hash it the way it's done in their implementation (it's in AbstractIDMigrator).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...