I have the two pieces of code below, which are meant to implement the same logic. I'm curious to know which of the two is better, and why.
1.
from pyspark.sql import functions as func  # assumed import for `func`
char_list = [('\\', '\\\\'), ('\n', '\\n'), ('\'', '\\\'')]
col_names = df.schema.names
df.select( *[func.regexp_replace(col_name, char_set[0], char_set[1]) for char_set in char_list for col_name in col_names])
2.
char_list = [('\\', '\\\\'), ('\n', '\\n'), ('\'', '\\\'')]
col_names = df.schema.names
for char_set in char_list:
    for col_name in col_names:
        df = df.withColumn(col_name, func.regexp_replace(col_name, char_set[0], char_set[1]))
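
Before comparing speed, it may be worth checking whether the two snippets really compute the same thing. The sketch below is a plain-Python model, not Spark code: it uses `re.sub` as a stand-in for `func.regexp_replace`, a dict as a stand-in for a single row, and the hypothetical names `select_shape`, `loop_shape`, and `chained_shape`. The patterns are written in Python `re` syntax and are assumed to match the intent of `char_list` (escape backslashes, newlines, and single quotes).

```python
import re
from functools import reduce

# (pattern, replacement) pairs in Python `re` syntax. Assumption: these mirror
# the intent of char_list (escape backslash, newline, and single quote).
CHAR_LIST = [(r"\\", r"\\\\"), ("\n", r"\\n"), ("'", r"\\'")]


def select_shape(row, char_list=CHAR_LIST):
    """Model of snippet 1: one output per (char_set, column) pair,
    each with only that single replacement applied."""
    return [re.sub(pat, repl, row[name]) for pat, repl in char_list for name in row]


def loop_shape(row, char_list=CHAR_LIST):
    """Model of snippet 2: every column receives every replacement, in order."""
    out = dict(row)
    for pat, repl in char_list:
        for name in out:
            out[name] = re.sub(pat, repl, out[name])
    return out


def chained_shape(row, char_list=CHAR_LIST):
    """Fold all replacements into one expression per column, built in one pass."""
    def apply_all(value):
        return reduce(lambda v, cs: re.sub(cs[0], cs[1], v), char_list, value)
    return {name: apply_all(value) for name, value in row.items()}


row = {"a": "x\ny", "b": "it's"}
print(len(select_shape(row)))                  # number of outputs in the snippet 1 model
print(loop_shape(row))                         # outputs in the snippet 2 model
print(chained_shape(row) == loop_shape(row))   # do the last two shapes agree?
```

Under this model, snippet 1 yields one output per (replacement, column) pair, each with a single replacement applied, while snippet 2 applies all replacements to every column and keeps the original column names. If the chained behaviour is what is wanted, the third shape suggests nesting the `regexp_replace` calls per column and passing the results, aliased to the original names, to a single `select`; that PySpark translation is an assumption and is not tested here. On the performance side, the PySpark documentation for `withColumn` notes that each call introduces a projection internally, that calling it in a loop can generate large plans, and that a single `select` over multiple columns is the recommended alternative.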
Question from: https://stackoverflow.com/questions/65834589/how-does-for-loop-impact-performance-of-spark-code