Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have a dictionary of dictionaries of the form:

{'user':{movie:rating} }

For example,

{Jill': {'Avenger: Age of Ultron': 7.0,
                            'Django Unchained': 6.5,
                            'Gone Girl': 9.0,
                            'Kill the Messenger': 8.0}
'Toby': {'Avenger: Age of Ultron': 8.5,
                                'Django Unchained': 9.0,
                                'Zoolander': 2.0}}

I want to convert this dict of dicts into a pandas dataframe with column 1 the user name and the other columns the movie ratings i.e.

user  Gone_Girl  Horrible_Bosses_2  Django_Unchained  Zoolander etc. 

However, some users did not rate the movies and so these movies are not included in the values() for that user key(). It would be nice in these cases to just fill the entry with NaN.

As of now, I iterate over the keys, fill a list, and then use this list to create a dataframe:

data=[] 
for i,key in enumerate(movie_user_preferences.keys() ):
    try:            
        data.append((key
                    ,movie_user_preferences[key]['Gone Girl']
                    ,movie_user_preferences[key]['Horrible Bosses 2']
                    ,movie_user_preferences[key]['Django Unchained']
                    ,movie_user_preferences[key]['Zoolander']
                    ,movie_user_preferences[key]['Avenger: Age of Ultron']
                    ,movie_user_preferences[key]['Kill the Messenger']))
    # if no entry, skip
    except:
        pass 
df=pd.DataFrame(data=data,columns=['user','Gone_Girl','Horrible_Bosses_2','Django_Unchained','Zoolander','Avenger_Age_of_Ultron','Kill_the_Messenger'])

But this only gives me a dataframe of users who rated all the movies in the set.

My goal is to append to the data list by iterating over the movie labels (rather than the brute force approach shown above) and, secondly, create a dataframe that includes all users and that places null values in the elements that do not have movie ratings.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
988 views
Welcome To Ask or Share your Answers For Others

1 Answer

You can pass the dict of dict to the DataFrame constructor:

In [11]: d = {'Jill': {'Django Unchained': 6.5, 'Gone Girl': 9.0, 'Kill the Messenger': 8.0, 'Avenger: Age of Ultron': 7.0}, 'Toby': {'Django Unchained': 9.0, 'Zoolander': 2.0, 'Avenger: Age of Ultron': 8.5}}

In [12]: pd.DataFrame(d)
Out[12]:
                        Jill  Toby
Avenger: Age of Ultron   7.0   8.5
Django Unchained         6.5   9.0
Gone Girl                9.0   NaN
Kill the Messenger       8.0   NaN
Zoolander                NaN   2.0

Or use the from_dict method:

In [13]: pd.DataFrame.from_dict(d)
Out[13]:
                        Jill  Toby
Avenger: Age of Ultron   7.0   8.5
Django Unchained         6.5   9.0
Gone Girl                9.0   NaN
Kill the Messenger       8.0   NaN
Zoolander                NaN   2.0

In [14]: pd.DataFrame.from_dict(d, orient='index')
Out[14]:
      Django Unchained  Gone Girl  Kill the Messenger  Avenger: Age of Ultron  Zoolander
Jill               6.5          9                   8                     7.0        NaN
Toby               9.0        NaN                 NaN                     8.5          2

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

548k questions

547k answers

4 comments

86.3k users

...