I have some data of shape B*C*T*H*W. I want to apply 2D convolutions over the H*W dimensions.
There are two options (that I see):
1. Apply a partial 3D convolution with kernel shape (1, 3, 3). 3D convolution accepts data of shape B*C*T*H*W, which is exactly what I have. However, this is a pseudo-3D convolution that might be under-optimized (despite its heavy use in P3D networks).
2. Transpose the data, apply 2D convolutions, and transpose the result back. This incurs the overhead of reshaping the data, but it makes use of the heavily optimized 2D convolutions.
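# Fold T into the batch dimension: (B, C, T, H, W) -> (B*T, C, H, W)
data = raw_data.transpose(1, 2).reshape(b * t, c, h, w).detach()
out = conv2d(data)
# Unfold back to (B, C_out, T, H, W); -1 lets the output channel count differ from c
out = out.view(b, t, -1, h, w).transpose(1, 2).contiguous()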
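For anyone who wants to try both options side by side, here is a minimal sketch, assuming PyTorch and small made-up tensor sizes. The names `run_conv3d`, `run_conv2d`, and `time_fn` are hypothetical helpers, not from any library. It copies the 3D kernel weights into the 2D layer so the two paths should compute the same result, checks that, and then times each on CPU.

```python
import time
import torch
import torch.nn as nn

torch.manual_seed(0)
b, c, t, h, w = 2, 4, 5, 16, 16  # illustrative sizes only (assumption)
c_out = 8
raw_data = torch.randn(b, c, t, h, w)

# Option 1: 3D convolution whose kernel has size 1 along T
conv3d = nn.Conv3d(c, c_out, kernel_size=(1, 3, 3), padding=(0, 1, 1))

# Option 2: 2D convolution sharing the same weights and bias
conv2d = nn.Conv2d(c, c_out, kernel_size=3, padding=1)
with torch.no_grad():
    # (C_out, C, 1, 3, 3) -> (C_out, C, 3, 3)
    conv2d.weight.copy_(conv3d.weight.squeeze(2))
    conv2d.bias.copy_(conv3d.bias)

def run_conv3d(x):
    return conv3d(x)

def run_conv2d(x):
    bb, cc, tt, hh, ww = x.shape
    # Fold T into the batch dimension, convolve, then unfold again
    y = conv2d(x.transpose(1, 2).reshape(bb * tt, cc, hh, ww))
    return y.reshape(bb, tt, -1, hh, ww).transpose(1, 2).contiguous()

def time_fn(fn, x, repeats=10):
    # Average wall-clock seconds per call, after one warm-up call
    with torch.no_grad():
        fn(x)
        start = time.perf_counter()
        for _ in range(repeats):
            fn(x)
        return (time.perf_counter() - start) / repeats

with torch.no_grad():
    out3d = run_conv3d(raw_data)
    out2d = run_conv2d(raw_data)

same = torch.allclose(out3d, out2d, atol=1e-4)
t3d = time_fn(run_conv3d, raw_data)
t2d = time_fn(run_conv2d, raw_data)
print(f"outputs match: {same}")
print(f"conv3d: {t3d * 1e3:.3f} ms, conv2d: {t2d * 1e3:.3f} ms")
```

The timings depend heavily on hardware, backend, and tensor sizes, so treat them as a rough indication only. On a GPU, the tensors and layers would need to be moved to the device, and `torch.cuda.synchronize()` called before reading the clock, since CUDA kernels run asynchronously.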
Which one is faster?
(Note: I have a self-answer below. This aims to be a quick note for people who are googling, aka me 20 minutes ago)
Question from: https://stackoverflow.com/questions/65930768/is-partial-3d-convolution-or-transpose2d-convolution-faster