link之家
链接快照平台
输入网页链接，自动生成快照
标签化管理网页链接
Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.
Learn more about Collectives
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
Learn more about Teams
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[4, 32, 6, 7] to have 3 channels, but got 32 channels instead

Ask Question
This is my implementation:
class Net(BaseFeaturesExtractor):
    def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 256):
        super(Net, self).__init__(observation_space, features_dim)
        n_input_channels = observation_space.shape[0]
        print("Observation space shape:"+str(observation_space.shape))
        print("Number of channels:" + str(n_input_channels))
        self.cnn = nn.Sequential(
            nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(in_features=128,out_features=64),
            nn.ReLU(),
            nn.Linear(in_features=64,out_features=7),
            nn.Sigmoid()
    def forward(self, observations: th.Tensor) -> th.Tensor:
        print("Observation shape:"+str(observations[0].shape))
        return self.cnn(observations)
When I tried to run the code which uses this CNN, I am getting following log:
    Observation space shape:(3, 6, 7) 
    Number of channels:3 
    Observation shape:torch.Size([3, 6, 7]) 
    Traceback (most recent call last):   File "/Users/joe/Documents/JUPYTER/ConnectX/training3.py", line 250, in <module>
        learner.learn(total_timesteps=iterations, callback=eval_callback)
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[4, 32, 6, 7] to have 3 channels, but got 32 channels instead
What is the problem here? How can I solve it?
in_channels of a conv layer should be equal to out_channels of the previous layer. In your case, in_channels of the 2nd and 3rd conv layers don't have the correct values. They should be like below,
    self.cnn = nn.Sequential(
        nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
Also, you should check in_features of the 1st Linear layer. It depends on the input shape and should be equal to last_conv_out_channels * last_conv_output_height * last_conv_output_width.
For example, for an input=torch.randn(1, 3, 256, 256) last conv layer's output shape would be ([1, 32, 64, 64]), in that case the 1st Linear layer should be,
nn.Linear(in_features=32*64*64,out_features=64)
---- Update after the comment:
Output shape of a conv layer is calculated through the formula here (see under "Shape:" section). Using input = torch.randn(1, 3, 256, 256) as input to the network, here are outputs of each conv layer (I skipped the ReLUs since they don't change the shape),
conv1: (1, 3, 256, 256) -> (1, 32, 256, 256)
conv2: (1, 32, 256, 256) -> (1, 32, 128, 128)
conv3: (1, 32, 128, 128) -> (1, 32, 64, 64)
So how did last_conv_output_height and last_conv_output_width became 64 ? The last conv layer is defined as follows,
nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1)
Data is processed as (num_samples, num_channels, height, width) in PyTorch and the default value for dilation is stated as 1 in the conv2d doc. So, for the last conv layer, H_in is 128, padding[0] is 1, dilation[0] is 1, kernel_size[0] is 3 and stride[0] is 2. Therefore, height of its output becomes,
H_out = ⌊(128 + 2 * 1 - 1 * (3 - 1) - 1) / 2⌋ + 1
H_out = 64
Since square-size kernels and equal-size stride, padding and dilation are used, W_out also becomes 64 for the last conv layer.
I think the easiest way to compute in_features for the 1st Linear layer would be run the model for the desired size input until that layer. An example for your architecture,
inp = torch.randn(1, 3, 256, 256)
arch = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1)
outp = arch(inp)
print('outp.shape:', outp.shape)
This prints,
outp.shape: torch.Size([1, 32, 64, 64])
Finally, last_conv_out_channels is out_channels of the last conv layer. The last conv layer in your architecture is nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1). Here out_channels is the 2nd parameter, so last_conv_out_channels is 32.
                Thank you for you answer! Now I am getting this error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (4x7 and 256x64). How can I know last_conv_out_channels, last_conv_output_height, last_conv_output_width?
– Joe Rakhimov
                Dec 25, 2020 at 11:16
        Thanks for contributing an answer to Stack Overflow!
Please be sure to answer the question. Provide details and share your research!
But avoid …
Asking for help, clarification, or responding to other answers.
Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.