I have a dataframe with some 2 million rows like this:
                    dt  num
0  2019-05-12 10:17:00  135
1  2018-01-16 21:32:00    5
2  2017-11-30 22:29:00  135
3  2017-10-05 16:59:00   19
4  2017-08-07 05:26:00    5
5  2017-06-12 17:47:00   18
For each distinct value in column 'num', I need to find the corresponding minimum value of column 'dt'.
I am doing it with a list comprehension that applies a boolean mask for each value and then takes min():
[(num_i, df[df.num == num_i].dt.min()) for num_i in set(df.num)]
It works, but it takes a really long time. Is there a less time-consuming way to do this?
Oops... thanks to all (@It_is_Chris, @papke, @paul-brennan)! I was thinking of doing a timing comparison, but the solution provided (groupby) finishes in seconds, versus close to one hour for my list comprehension...
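For anyone landing here later, this is roughly what the groupby approach looks like. It is only a minimal sketch: the small frame is rebuilt from the sample rows above, and the names `min_dt` and `pairs` are mine, not from the answers. I am also assuming 'dt' is (or can be converted to) a datetime column.

```python
import pandas as pd

# Sample frame rebuilt from the six rows shown in the question.
df = pd.DataFrame({
    "dt": pd.to_datetime([
        "2019-05-12 10:17:00",
        "2018-01-16 21:32:00",
        "2017-11-30 22:29:00",
        "2017-10-05 16:59:00",
        "2017-08-07 05:26:00",
        "2017-06-12 17:47:00",
    ]),
    "num": [135, 5, 135, 19, 5, 18],
})

# One pass over the data: group rows by 'num', take the earliest 'dt' per group.
# The result is a Series indexed by 'num' (sorted by key by default).
min_dt = df.groupby("num")["dt"].min()

# Same (num, min dt) pairs as the list comprehension, if a list is needed.
pairs = list(min_dt.items())
```

The list comprehension builds a full boolean mask over all rows once per distinct 'num', so its cost grows with rows times distinct values, while groupby splits the rows into groups once. That is presumably why the difference is so large on 2 million rows.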