Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


My dataset contains a column with names that I want to check in a for loop:

Name      Age

John      32
Luke      23
Christine  54
Mary      39
AnneMarie  42
Eoin      23

I need to check each name against a website that returns pairs of the form ('name', score), where score is a number. The pairs come from the following code (it is not runnable as posted; it was extracted only to show how I obtain the data I would like to store in my dataframe):

for name in df['Name']:

    # missing code
    for c in zip(names, scores):
        print(c)

For example, when name = John, c gives me the following output:

('Julie', 6.7)
('Michael', 3.4)
('John John', 3.1)
('Ludo', 3.0)
('Chris', 3.0)

when name = Luke, c gives me the following output:

('Mary', 2.7)
('Michael', 2.1)
('Bill', 3.5)
('Jess', 3.2)

and so on.

I would like to add this information to my dataframe so that it looks something like this:

Name       Age  Friends                                   Score

John       32   [Julie, Michael, John John, Ludo, Chris]  [6.7, 3.4, 3.1, 3.0, 3.0]
Luke       23   [Mary, Michael, Bill, Jess]               [2.7, 2.1, 3.5, 3.2]
Christine  54
Mary       39
AnneMarie  42   ....
Eoin       23

I would appreciate help on how to build a dataframe like this from the results c produced for each name in the Name column.



1 Answer

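Try:

# create object-dtype columns so that each cell can hold a list
df['Friends'] = None
df['Score'] = None

# iterate with the index so that each row can be updated
# (.items() replaces .iteritems(), which was removed in pandas 2.0)
for idx, name in df['Name'].items():

    # missing code
    for c in zip(names, scores):
        print(c)

    # .at sets exactly one cell; .loc treats a list as several values
    df.at[idx, 'Friends'] = names
    df.at[idx, 'Score'] = scores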
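
To make the per-row approach above concrete, here is a minimal runnable sketch. The `lookup` function and its `FAKE_RESULTS` table are hypothetical stand-ins for the "missing code" that queries the website; they simply return the sample output quoted in the question. The sketch also assumes that storing a list in a single cell is done with `.at` on an object-dtype column, which is a common way to do it rather than the only one.

```python
import pandas as pd

# Hypothetical stand-in for the website query (the "missing code").
# The values are the sample results quoted in the question.
FAKE_RESULTS = {
    "John": (["Julie", "Michael", "John John", "Ludo", "Chris"],
             [6.7, 3.4, 3.1, 3.0, 3.0]),
    "Luke": (["Mary", "Michael", "Bill", "Jess"],
             [2.7, 2.1, 3.5, 3.2]),
}

def lookup(name):
    """Return (names, scores) for one name (hypothetical helper)."""
    return FAKE_RESULTS[name]

df = pd.DataFrame({"Name": ["John", "Luke"], "Age": [32, 23]})

# Object-dtype columns, so that a single cell can hold a whole list.
df["Friends"] = pd.Series([None] * len(df), index=df.index, dtype=object)
df["Score"] = pd.Series([None] * len(df), index=df.index, dtype=object)

for idx, name in df["Name"].items():
    names, scores = lookup(name)
    # .at addresses exactly one cell, so the list is stored as-is.
    df.at[idx, "Friends"] = names
    df.at[idx, "Score"] = scores

print(df)
```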

Or, better, collect all the names and scores in lists and assign them once after the for loop:

# initialization
name_lists, score_lists = [], []

for name in df['Name']: 

    # missing code
    for c in zip(names, scores):
        print(c)

    name_lists.append(names)
    score_lists.append(scores)

# update the data frame
df['Friends'] = name_lists
df['Score'] = score_lists

The second version is slightly faster than the first for small to medium dataframes. For very large dataframes, the repeated append calls can themselves become slow.
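
A runnable sketch of the collect-then-assign version, under the same caveat: `lookup` and `FAKE_RESULTS` are hypothetical placeholders for the website query, and returning empty lists for a name with no results is an assumption made here for illustration, not something stated in the question.

```python
import pandas as pd

# Hypothetical stand-in for the website query (the "missing code").
FAKE_RESULTS = {
    "John": (["Julie", "Michael", "John John", "Ludo", "Chris"],
             [6.7, 3.4, 3.1, 3.0, 3.0]),
    "Luke": (["Mary", "Michael", "Bill", "Jess"],
             [2.7, 2.1, 3.5, 3.2]),
}

def lookup(name):
    """Return (names, scores); empty lists when nothing is found (assumption)."""
    return FAKE_RESULTS.get(name, ([], []))

df = pd.DataFrame({"Name": ["John", "Luke", "Eoin"], "Age": [32, 23, 23]})

# Collect one list per row, in the same order as df['Name'].
name_lists, score_lists = [], []
for name in df["Name"]:
    names, scores = lookup(name)
    name_lists.append(names)
    score_lists.append(scores)

# A single assignment per column; each cell ends up holding a list.
df["Friends"] = name_lists
df["Score"] = score_lists

print(df)
```

Because the lists are built in row order, no index bookkeeping is needed; the outer lists only have to contain one entry per row of the dataframe.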

