S1: Introduction to the components of Deep Learning models

A Deep Learning (DL) model is essentially a piece of code; in PyTorch, it is a Python class that inherits from torch.nn.Module. A complete DL model test case that a DL framework (e.g., PyTorch) can execute consists of three key components. We discuss only the simplest case here, because real-world models, especially Large Models, are much larger:

  1. A Python class that inherits from torch.nn.Module
  2. Model initialization
  3. Tensor inputs

Here is a concrete example that contains all three components.

				
import torch


# A Python class inherited from `torch.nn.Module`
class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv1d(in_channels=1, out_channels=3, kernel_size=1)
        self.linear = torch.nn.Linear(3, 1)

    def forward(self, x):
        x = x.unsqueeze(1)  # tensor shape: (1,3) -> (1,1,3)
        x = self.conv(x)  # tensor shape: (1,1,3) -> (1,3,3)
        x = x.mean(dim=-1)  # tensor shape: (1,3,3) -> (1,3)
        return self.linear(x)  # tensor shape: (1,3) -> (1,1)


m = Model()  # Model initialization

x = torch.randn(1, 3)  # Tensor inputs
inputs = [x]

output = m(x)
print(output)
"""
tensor([[-0.0931]], grad_fn=<AddmmBackward0>)
"""
				
			

Now let me explain the code in more detail. [Shaoyu: to write].
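
The shape comments in the model's forward pass can be checked mechanically. The sketch below is one possible way to do this, not part of the original example: it re-declares the same Model and replays forward step by step, recording each intermediate shape. The helper name trace_shapes is hypothetical and introduced only for illustration.

```python
import torch


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv1d(in_channels=1, out_channels=3, kernel_size=1)
        self.linear = torch.nn.Linear(3, 1)

    def forward(self, x):
        x = x.unsqueeze(1)
        x = self.conv(x)
        x = x.mean(dim=-1)
        return self.linear(x)


def trace_shapes(model, x):
    """Hypothetical helper: replay forward() one step at a time and record shapes."""
    shapes = [("input", tuple(x.shape))]
    x = x.unsqueeze(1)  # add a channel dimension
    shapes.append(("unsqueeze", tuple(x.shape)))
    x = model.conv(x)  # 1 input channel -> 3 output channels
    shapes.append(("conv", tuple(x.shape)))
    x = x.mean(dim=-1)  # average over the last (length) dimension
    shapes.append(("mean", tuple(x.shape)))
    x = model.linear(x)  # 3 features -> 1 feature
    shapes.append(("linear", tuple(x.shape)))
    return shapes


m = Model()
x = torch.randn(1, 3)
shapes = trace_shapes(m, x)
for name, shape in shapes:
    print(f"{name:>9}: {shape}")
```

The recorded shapes depend only on the layer configuration, not on the random weights or input values, so they are stable across runs even though the printed output tensor is not.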

S2: Debugging PyTorch models

Sometimes we encounter errors when running the code. An error may reveal a bug in PyTorch itself, but most of the time the code under test is what is buggy (typically because it was generated by a DL fuzzer or contains typos). For example, consider the code below:

				
import torch


class MatrixModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.mm_layer = torch.nn.Linear(10, 10)  # defined but never used in forward

    def forward(self, x1, inp):
        # Constraint1: Row-column alignment (x1.shape[1] must equal inp.shape[0])
        v1 = torch.mm(x1, inp)
        # Constraint2: Same shape or broadcastable
        v2 = v1 + inp
        return v2
        
func = MatrixModel()

# Tensor shapes violate Constraint1: torch.mm needs x1.shape[1] == inp.shape[0], but 10 != 2
x1 = torch.randn(2, 10)
inp = torch.randn(2, 10)
test_inputs = [x1, inp]
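
To make the two constraints concrete, here is a minimal sketch, assuming a simplified MatrixModel without the unused mm_layer. The helper check_constraints is hypothetical and not part of PyTorch; it only mirrors the two comments in forward. The sketch first shows that the shapes above are rejected, then picks shapes that satisfy both constraints.

```python
import torch


class MatrixModel(torch.nn.Module):
    def forward(self, x1, inp):
        v1 = torch.mm(x1, inp)  # Constraint1: row-column alignment
        return v1 + inp  # Constraint2: same shape or broadcastable


def check_constraints(x1_shape, inp_shape):
    """Hypothetical helper: list the shape constraints violated by 2-D inputs."""
    problems = []
    if x1_shape[1] != inp_shape[0]:
        problems.append(
            f"Constraint1: x1 has {x1_shape[1]} columns but inp has {inp_shape[0]} rows"
        )
        return problems  # the shape of v1 is undefined, so stop here
    # v1 has shape (x1 rows, inp columns); the column counts of v1 and inp
    # always match, so only the row counts decide broadcastability.
    v1_rows, inp_rows = x1_shape[0], inp_shape[0]
    if v1_rows != inp_rows and 1 not in (v1_rows, inp_rows):
        problems.append(
            f"Constraint2: rows {v1_rows} and {inp_rows} are not broadcastable"
        )
    return problems


model = MatrixModel()

# The shapes from the example above: (2, 10) x (2, 10) breaks Constraint1.
bad_problems = check_constraints((2, 10), (2, 10))
try:
    model(torch.randn(2, 10), torch.randn(2, 10))
    raised = False
except RuntimeError as err:
    raised = True
    print("PyTorch rejected the inputs:", err)

# Shapes that satisfy both constraints: (2, 2) x (2, 10) -> (2, 10).
good_problems = check_constraints((2, 2), (2, 10))
out = model(torch.randn(2, 2), torch.randn(2, 10))
print(bad_problems, good_problems, tuple(out.shape))
```

Checking shapes before calling the model separates invalid test inputs from genuine framework errors, which is the distinction this section is concerned with.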
				
			

[shaoyu: to write]

S3: Correctness of torch.compile()

[shaoyu: to write]