RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'

Traceback (most recent call last):
  File "E:/MyWorkspace/EEG/Pytorch/Train.py", line 79, in <module>
    opti='Adam')
  File "E:\MyWorkspace\EEG\Pytorch\Utils.py", line 133, in TrainTest_Model
    validation_loss, validation_acc = Test_Model(net, testloader, criterion,True)
  File "E:\MyWorkspace\EEG\EEGLearn-Pytorch\Utils.py", line 82, in Test_Model
    loss = criterion(outputs, labels.cuda()) # GPU
  File "D:\coson\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\coson\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\loss.py", line 1166, in forward
    label_smoothing=self.label_smoothing)
  File "D:\coson\anaconda3\envs\pytorch\lib\site-packages\torch\nn\functional.py", line 3014, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'
Process finished with exit code 1

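The key failure is at criterion(outputs, labels.cuda()). In this project, criterion is an instance of the CrossEntropyLoss class at runtime, that is: criterion = nn.CrossEntropyLoss().

So the error occurs while the loss is being computed, and the cause is a type mismatch. But which argument has the wrong type? It is actually labels.

I built the arguments by hand and my labels are int32, i.e. Int, so this was easy to tell. But what if you don't know which argument is at fault? Read on.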
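If it is unclear which argument is at fault, one quick check is to print the dtype of both tensors just before the loss call. The following is a minimal CPU sketch with stand-in tensors; the shapes and values are illustrative assumptions, not taken from the project.

```python
import torch

# Stand-ins for the real tensors: 4 samples, 3 classes (hypothetical shapes).
outputs = torch.randn(4, 3)                              # logits, float32 by default
labels = torch.tensor([0, 2, 1, 1], dtype=torch.int32)   # int32 labels, as in this post

print(outputs.dtype)  # torch.float32
print(labels.dtype)   # torch.int32

# The documentation example for cross_entropy uses int64 (torch.long)
# for class-index targets, so anything else is worth a closer look.
labels_need_cast = labels.dtype != torch.long
print(labels_need_cast)  # True
```

In the real code, the same two print calls can be placed right before the criterion(...) line in Test_Model.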
Look at line 1166 of torch\nn\modules\loss.py, which is label_smoothing=self.label_smoothing. Line 1166 is only the last line of a function call; the complete code is:

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        return F.cross_entropy(input, target, weight=self.weight,
                               ignore_index=self.ignore_index, reduction=self.reduction,
                               label_smoothing=self.label_smoothing)

Open the definition of cross_entropy; its signature is:

def cross_entropy(
    input: Tensor,
    target: Tensor,
    weight: Optional[Tensor] = None,
    size_average: Optional[bool] = None,
    ignore_index: int = -100,
    reduce: Optional[bool] = None,
    reduction: str = "mean",
    label_smoothing: float = 0.0,
) -> Tensor:

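The arguments we pass in correspond to input and target of cross_entropy. The signature does not say which dtype input and target must have, so which of the two is mismatched? Reading further down cross_entropy, we find that it calls into a C function, which looks like a dead end: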

torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

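The underlying implementation of this C function is not visible from the Python source. What now? Check the official description and examples for the function. That turns things around: the docstring of cross_entropy contains the following example: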

  # Example of target with class indices
  input = torch.randn(3, 5, requires_grad=True)
  target = torch.randint(5, (3,), dtype=torch.int64)
  loss = F.cross_entropy(input, target)
  loss.backward()
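The official example uses int64, i.e. the long type, for target.
So we can conclude that the error is caused by the dtype of the labels argument in `criterion(outputs, labels.cuda())`.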
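To double-check this diagnosis outside the project, the official example can be rerun with the same target values stored as int32. This is a minimal sketch on CPU, so it needs no GPU; the variable names are illustrative. On the PyTorch versions I am aware of, the int32 call raises a RuntimeError, but the exact message differs between CPU and CUDA and between versions, so the sketch only records whether an error occurred.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 5)                                # 3 samples, 5 classes
target_long = torch.randint(5, (3,), dtype=torch.int64)   # as in the official example
target_int = target_long.to(torch.int32)                  # same values, int32 dtype

# int64 targets: the documented case, computes a loss.
loss_ok = F.cross_entropy(logits, target_long)
print("int64 loss:", loss_ok.item())

# int32 targets: expected to fail; record the outcome instead of assuming it.
try:
    F.cross_entropy(logits, target_int)
    int32_failed = False
except RuntimeError as err:
    int32_failed = True
    print("RuntimeError:", err)

# Casting with .long() gives the same loss as the int64 case.
loss_fixed = F.cross_entropy(logits, target_int.long())
print("loss after .long():", loss_fixed.item())
```

If the int32 call fails and the .long() call succeeds, the labels dtype is confirmed as the cause.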
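Based on this, we can convert the dtype of labels:
```python
# Cast the labels to int64 (long) before moving them to the GPU
labels.long().cuda()

# The loss call in Test_Model then becomes:
loss = criterion(outputs, labels.long().cuda())
```

After this change, the code runs normally.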