
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[4, 32, 6, 7] to have 3 channels, but got 32 channels instead


This is my implementation:

import gym
import torch as th
import torch.nn as nn
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor

class Net(BaseFeaturesExtractor):
    def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 256):
        super(Net, self).__init__(observation_space, features_dim)
        n_input_channels = observation_space.shape[0]
        print("Observation space shape:" + str(observation_space.shape))
        print("Number of channels:" + str(n_input_channels))
        self.cnn = nn.Sequential(
            nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(in_features=128, out_features=64),
            nn.ReLU(),
            nn.Linear(in_features=64, out_features=7),
            nn.Sigmoid()
        )

    def forward(self, observations: th.Tensor) -> th.Tensor:
        print("Observation shape:" + str(observations[0].shape))
        return self.cnn(observations)

When I try to run the code that uses this CNN, I get the following log:

    Observation space shape:(3, 6, 7) 
    Number of channels:3 
    Observation shape:torch.Size([3, 6, 7]) 
    Traceback (most recent call last):
      File "/Users/joe/Documents/JUPYTER/ConnectX/training3.py", line 250, in <module>
        learner.learn(total_timesteps=iterations, callback=eval_callback)
    RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[4, 32, 6, 7] to have 3 channels, but got 32 channels instead

What is the problem here? How can I solve it?

The in_channels of a conv layer must equal the out_channels of the previous layer. In your case, the in_channels of the 2nd and 3rd conv layers have the wrong values. They should be as below:

    self.cnn = nn.Sequential(
        nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        # ... remaining layers unchanged
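
To see why the original code failed: the 1st conv layer outputs 32 channels, but the 2nd conv layer was constructed to expect n_input_channels = 3. A minimal sketch reproducing the error outside the network:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1)  # built for 3 input channels
x = torch.randn(4, 32, 6, 7)  # ...but fed the 32-channel output of the previous layer
conv(x)  # RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected
         # input[4, 32, 6, 7] to have 3 channels, but got 32 channels instead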

Also, you should check in_features of the 1st Linear layer. It depends on the input shape and should be equal to last_conv_out_channels * last_conv_output_height * last_conv_output_width.

For example, for an input of torch.randn(1, 3, 256, 256), the last conv layer's output shape would be [1, 32, 64, 64]; in that case the 1st Linear layer should be:

nn.Linear(in_features=32*64*64, out_features=64)
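
As an aside, rather than computing in_features by hand, you can infer it with a dummy forward pass through the conv stack (this is the pattern used in the Stable-Baselines3 custom feature extractor docs). A minimal sketch for the (3, 6, 7) observations from your question:

import torch
import torch.nn as nn

# Conv stack with the corrected in_channels, ending in Flatten.
cnn = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Flatten(),
)

# Dummy forward pass with one observation shaped like your (3, 6, 7) space;
# no gradients are needed just to read the flattened size.
with torch.no_grad():
    n_flatten = cnn(torch.zeros(1, 3, 6, 7)).shape[1]

print(n_flatten)  # 128 here: 32 channels * 2 * 2 spatial after two stride-2 convs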

---- Update after the comment:

The output shape of a conv layer is calculated with the formula in the PyTorch Conv2d documentation (see the "Shape:" section). Using input = torch.randn(1, 3, 256, 256) as input to the network, here are the outputs of each conv layer (I skipped the ReLUs since they don't change the shape):

conv1: (1, 3, 256, 256) -> (1, 32, 256, 256)
conv2: (1, 32, 256, 256) -> (1, 32, 128, 128)
conv3: (1, 32, 128, 128) -> (1, 32, 64, 64)

So how did last_conv_output_height and last_conv_output_width become 64? The last conv layer is defined as follows:

nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1)

Data is processed as (num_samples, num_channels, height, width) in PyTorch, and the default value for dilation is stated as 1 in the Conv2d doc. So, for the last conv layer, H_in is 128, padding[0] is 1, dilation[0] is 1, kernel_size[0] is 3 and stride[0] is 2. Therefore, the height of its output becomes:

H_out = ⌊(128 + 2 * 1 - 1 * (3 - 1) - 1) / 2⌋ + 1
H_out = ⌊127 / 2⌋ + 1 = 63 + 1
H_out = 64

Since square kernels and the same stride, padding and dilation are used along both dimensions, W_out also becomes 64 for the last conv layer.
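
If you want that formula in code, a small helper along these lines (a sketch of the same "Shape:" formula, not part of the network) gives the per-dimension output size:

import math

def conv2d_out(size, kernel_size=3, stride=2, padding=1, dilation=1):
    # Output length along one spatial dimension, per the Conv2d "Shape:" formula.
    return math.floor((size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride) + 1

print(conv2d_out(128))            # 64, matching H_out above
print(conv2d_out(256, stride=1))  # 256: the stride-1 conv keeps the spatial size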

I think the easiest way to compute in_features for the 1st Linear layer would be to run the model on the desired input size up to that layer. An example for your architecture:

import torch
import torch.nn as nn

inp = torch.randn(1, 3, 256, 256)
arch = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1)
)
outp = arch(inp)
print('outp.shape:', outp.shape)

This prints,

outp.shape: torch.Size([1, 32, 64, 64])

Finally, last_conv_out_channels is out_channels of the last conv layer. The last conv layer in your architecture is nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1). Here out_channels is the 2nd parameter, so last_conv_out_channels is 32.
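
If you'd rather read it off programmatically, Conv2d modules expose their constructor arguments as attributes, so you can also query the layer directly (a sketch reusing the arch defined above):

last_conv = arch[-1]           # the last Conv2d in the Sequential above
print(last_conv.out_channels)  # 32
print(last_conv.weight.shape)  # torch.Size([32, 32, 3, 3]) -> out_channels comes first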

Thank you for your answer! Now I am getting this error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (4x7 and 256x64). How can I know last_conv_out_channels, last_conv_output_height, last_conv_output_width? – Joe Rakhimov Dec 25, 2020 at 11:16
