Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I am currently working a lot with Google Colab and wanted to measure the inference time of a small MLP. However, when I run the code below in Google Colab, the reported runtime decreases each time the notebook is executed again (i.e., when I repeatedly press the play button after the notebook has finished).

I am getting values of roughly:

1st execution: 0.03   s
2nd execution: 0.005  s
3rd execution: 0.0007 s

I tested this on different machines and with different browsers. Note that I'm aware that time.time() has a precision limit of about 1 ms on Unix systems; however, this does not explain the behaviour.

Is there some sort of caching happening on the GPU or in PyTorch? If so, why, and can I expect the same speed-up in the final, deployed application as well?

Code to replicate the behaviour:

import time
import torch 
import torch.nn.functional as F 

class Model(torch.nn.Module): 
  def __init__(self): 
    super(Model, self).__init__()

    self.fc1 = torch.nn.Linear(in_features=3, out_features=512)
    self.fc2 = torch.nn.Linear(in_features=512, out_features=512)
    self.fc3 = torch.nn.Linear(in_features=512, out_features=512)
    self.fc4 = torch.nn.Linear(in_features=512, out_features=512)
    self.fc5 = torch.nn.Linear(in_features=512, out_features=512)
    self.fc6 = torch.nn.Linear(in_features=512, out_features=512)
    self.fc7 = torch.nn.Linear(in_features=512, out_features=1)

  def forward(self, x): 
    in_dim = x.shape[0]
    x = F.relu(self.fc1(x.reshape(in_dim**2, 3)))
    x = F.relu(self.fc2(x))
    x = F.relu(self.fc3(x))
    x = F.relu(self.fc4(x))
    x = F.relu(self.fc5(x))
    x = F.relu(self.fc6(x))
    x = F.relu(self.fc7(x)).reshape((in_dim, in_dim, 1))
    return x 

device = 'cuda'
mlp = Model().to(device)

dim = 1024
input_tensor = torch.rand((dim, dim, 3), device=device)    

with torch.no_grad(): 
  start_time = time.time()
  out = mlp(input_tensor)
  end_time = time.time() 
  print("Model FW pass {}p: {} seconds".format(input_tensor.shape[0], 
                                               end_time-start_time))
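
One way to check whether the drop comes from one-time warm-up costs and from asynchronous GPU execution, rather than from the model itself, is to time the call only after a few untimed warm-up runs and to wait for the device before reading the clock. The sketch below is an assumption about how that could be done, not a confirmed diagnosis of the numbers above. The helper `time_call` and its parameters `sync`, `warmup` and `repeats` are hypothetical names introduced for illustration; it uses only the standard library so that it also runs without a GPU.

```python
import time
from statistics import median


def time_call(fn, sync=None, warmup=3, repeats=10):
    """Return the median wall-clock time of fn() in seconds.

    sync is a hypothetical hook: a zero-argument callable that blocks
    until all queued device work has finished. On a CUDA device,
    torch.cuda.synchronize could be passed here. If sync is None,
    nothing is waited for (plain CPU timing).
    """
    if sync is None:
        sync = lambda: None

    # Untimed warm-up runs absorb one-time costs such as lazy
    # initialisation, so they do not show up in the measurement.
    for _ in range(warmup):
        fn()
    sync()

    samples = []
    for _ in range(repeats):
        sync()                       # no leftover work before starting the clock
        start = time.perf_counter()  # monotonic, high-resolution timer
        fn()
        sync()                       # wait for the work itself, not just its launch
        samples.append(time.perf_counter() - start)
    return median(samples)


# Assumed usage with the names from the question (needs torch and a GPU):
#
#   with torch.no_grad():
#       seconds = time_call(lambda: mlp(input_tensor),
#                           sync=torch.cuda.synchronize)
#   print("Model FW pass: {} seconds".format(seconds))
```

If the median measured this way stays stable across repeated notebook executions, the decreasing values are more likely an artefact of how the time was taken than a real speed-up of the forward pass.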
Question from: https://stackoverflow.com/questions/66049240/google-colab-mlp-execution-time-not-constant-but-decreasing


1 Answer

Waiting for answers
