Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I'm trying to do some data work in Python pandas and having trouble writing out my results. I read my data in as a CSV file and been exporting each script as it's own CSV file which works fine. Lately though I've tried exporting everything in 1 Excel file with worksheets and a few of the sheets give me an error

"'utf8' codec can't decode byte 0xe9 in position 1: invalid continuation byte"

I have no idea how to even start finding any characters that could be causing problems exporting to Excel. Not sure why it exports to CSV just fine though :(

relevant lines

from pandas import ExcelWriter
data = pd.read_csv(input)
writer = ExcelWriter(output) #output is just the filename
fundraisers.to_excel(writer, "fundraisers")
locations.to_excel(writer, "locations") #error
locations.to_csv(outputcsv) #works
writer.save()

printing head of offending dataframe

Event ID    Constituent ID  Email Address   First Name     Last Name
f       1       A       A       1
F       4       L       R       C
M       1       1       A       D
F       4       A       A       G
M       2       0       R       G
M       3       O       O       H
M       2       T       E       H
M       2       A       A       H
M       2       M       M       K
F       3       J       E       K
Location ID raised  raised con  raised email
a   0   0   0
a   8   0   0
o   0   0   0
o   0   0   0
o   0   0   0
t   5   0   0
o   1   0   0
o   6   a   0
o   6   0   0
d   0   0   0

looking at the excel sheet I do actually get a partial print out. Anything in the first name column and beyond are blank, but event, constituent and email all print.

edit: Trying to read the csv in as utf8 fails, but reading it in as latin1 works. Is there a way to specify the to_excel encoding? Or decode and encode my dataframe to utf8?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
1.1k views
Welcome To Ask or Share your Answers For Others

1 Answer

Managed to solve this.

I made a function that goes through my columns that have strings and managed to decode/encode them into utf8 and it now works.

def changeencode(data, cols):
    for col in cols:
        data[col] = data[col].str.decode('iso-8859-1').str.encode('utf-8')
    return data   

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...